PRODUCTION OF SYRINGYL LIGNIN IN GYMNOSPERMS
Field of the Invention 

This application claims the benefit of U.S. Provisional Application number 60/033,381, filed 
December 16, 1996. The invention relates to the molecular modification of gymnosperms in order to 
cause the production of syringyl units during lignin biosynthesis and to production and propagation of
gymnosperms containing syringyl lignin. 
Background of the Invention 

Lignin is a major part of the supportive structure of most woody plants including angiosperm 
and gymnosperm trees which in turn are the principal sources of fiber for making paper and cellulosic 
products. In order to liberate fibers from wood structure in a manner suitable for making many grades
of paper, it is necessary to remove much of the lignin from the fiber/lignin network. Lignin is removed 
from wood chips by treatment of the chips in an alkaline solution at elevated temperatures and pressure 
in an initial step of papermaking processes. The rate of removal of lignin from wood of different tree 
species varies depending upon lignin structure. Three different lignin structures have been identified in 
trees: p-hydroxyphenyl, guaiacyl and syringyl, which are illustrated in Fig. 1.

Angiosperm species, such as Liquidambar styraciflua L. [sweetgum], have lignin composed of a
mixture of guaiacyl and syringyl monomer units. In contrast, gymnosperm species such as Pinus taeda 
L. [loblolly pine] have lignin which is devoid of syringyl monomer units. Generally speaking, the rate 
of delignification in a pulping process is directly proportional to the amount of syringyl lignin present in 
the wood. The higher delignification rates associated with species having a greater proportion of
syringyl lignin result in more efficient pulp mill operations since the mills make better use of energy 
and capital investment and the environmental impact is lessened due to a decrease in chemicals used for 
delignification. 

It is therefore an object of the invention to provide gymnosperm species which are easier to 
delignify in pulping processes.

Another object of the invention is to provide gymnosperm species such as loblolly pine which 
contain syringyl lignin. 

An additional object of the invention is to provide a method for modifying genes involved in 
lignin biosynthesis in gymnosperm species so that production of syringyl lignin is increased while 
production of guaiacyl lignin is suppressed.

Still another object of the invention is to produce whole gymnosperm plants containing genes 
which increase production of syringyl lignin and repress production of guaiacyl lignin. 

Yet another object of the invention is to identify, isolate and/or clone those genes in 
angiosperms responsible for production of syringyl lignin. 

A further object of the invention is to provide, in gymnosperms, genes which produce syringyl
lignin.

Another object of the invention is to provide a method for making an expression cassette 
insertable into a gymnosperm cell for the purpose of inducing formation of syringyl lignin in a 
gymnosperm plant derived from the cell. 
Definitions

The term "promoter" refers to a DNA sequence in the 5' flanking region of a given gene which
is involved in recognition and binding of RNA polymerase and other transcriptional proteins and is 
required to initiate DNA transcription in cells. 

The term "constitutive promoter" refers to a promoter which activates transcription of a desired 
gene, and is commonly used in creation of an expression cassette designed for preliminary experiments
relative to testing of gene function. An example of a constitutive promoter is 35S CaMV, available
from Clontech.

The term "expression cassette" refers to a double stranded DNA sequence which contains both 
promoters and genes such that expression of a given gene is achieved upon insertion of the expression
cassette into a plant cell.

The term "plant" includes whole plants and portions of plants, including plant organs (e.g. 
roots, stems, leaves, etc.).

The term "angiosperm" refers to plants which produce seeds encased in an ovary. A specific 
example of an angiosperm is Liquidambar styraciflua (L.)[sweetgum]. The angiosperm sweetgum
produces syringyl lignin.

The term "gymnosperm" refers to plants which produce naked seeds, that is, seeds which are 
not encased in an ovary. A specific example of a gymnosperm is Pinus taeda (L.)[loblolly pine]. The
gymnosperm loblolly pine does not produce syringyl lignin. 
Summary of the Invention

With regard to the above and other objects, the invention provides a method for inducing
production of syringyl lignin in gymnosperms, and gymnosperms which contain syringyl lignin, for
improved delignification in the production of pulp for papermaking and other applications. In 
accordance with one of its aspects, the invention involves cloning an angiosperm DNA sequence which 
codes for enzymes involved in production of syringyl lignin monomer units, fusing the angiosperm
DNA sequence to a lignin promoter region to form an expression cassette, and inserting the expression
cassette into a gymnosperm genome. 

Enzymes required for production of syringyl lignin in an angiosperm are obtained by deducing 
an amino acid sequence of the enzyme, extrapolating an mRNA sequence from the amino acid 
sequence, constructing a probe for the corresponding DNA sequence and cloning the DNA sequence
which codes for the desired enzyme. A promoter region specific to a gymnosperm lignin biosynthesis
gene is identified by constructing a probe for a gymnosperm lignin biosynthesis gene, sequencing the 5'
flanking region of the DNA which encodes the gymnosperm lignin biosynthesis gene to locate a
promoter sequence, and then cloning that sequence.

An expression cassette is constructed by fusing the angiosperm syringyl lignin DNA sequence
to the gymnosperm promoter DNA sequence. Alternatively, the angiosperm syringyl lignin DNA is
fused to a constitutive promoter to form an expression cassette. The expression cassette is inserted into
the gymnosperm genome to transform the gymnosperm genome. Cells containing the transformed
genome are selected and used to produce a transformed gymnosperm plant containing syringyl lignin.

In accordance with the invention, the angiosperm gene sequences bi-OMT, 4CL, P450-1 and
P450-2 have been determined and isolated as associated with production of syringyl lignin in sweetgum,
and lignin promoter regions for the gymnosperm loblolly pine have been determined to be the 5'
flanking regions for the 4CL1B, 4CL3B and PAL gymnosperm lignin genes. Expression cassettes
containing sequences of selected genes from sweetgum have been inserted into loblolly pine
embryogenic cells and presence of sweetgum genes associated with production of syringyl lignin has
been confirmed in daughter cells of the resulting loblolly pine embryogenic cells.

The invention therefore enables production of gymnosperms such as loblolly pine containing
genes which code for production of syringyl lignin, to thereby produce in such species syringyl lignin in
the wood structure for enhanced pulpability.
Brief Description of the Drawings

The above and other aspects of the invention will now be further described in the following 
detailed specification considered in conjunction with the following drawings in which:
Fig. 1 illustrates a generalized pathway for lignin synthesis;

Fig. 2 illustrates a bifunctional-O-methyl transferase (bi-OMT) gene sequence involved in the 
production of syringyl lignin in an angiosperm (SEQ ID 5 and 6); 

Fig. 3 illustrates a 4-coumarate CoA ligase ( 4CL) gene sequence involved in the production of 
syringyl lignin in an angiosperm (SEQ ID 7 and 8); 

Fig. 4 illustrates a P450-1 gene sequence involved in the production of syringyl lignin in an
angiosperm (SEQ ID 1 and 2); 

Fig. 5 illustrates a P450-2 gene sequence involved in the production of syringyl lignin in an
angiosperm (SEQ ID 3 and 4); 

Fig. 6 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine 4CL1B gene 
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 10);

Fig. 7 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine 4CL3B gene
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 11);

Fig. 8 illustrates nucleotide sequences of the 5' flanking region of loblolly pine PAL gene
showing the location of regulatory elements for lignin biosynthesis (SEQ ID 9); and

Fig. 9 illustrates a PCR confirmation of the sweetgum P450-1 gene sequence in transgenic
loblolly pine cells. 

Detailed Description of the Invention

In accordance with the invention, a method is provided for modifying a gymnosperm genome,
such as the genome of a loblolly pine, so that syringyl lignin will be produced in the resulting plant, 
thereby enabling cellulosic fibers of the same to be more easily separated from lignin in a pulping 
process. In general, this is accomplished by fusing one or more angiosperm DNA sequences (referred 
to at times herein as the "ASL DNA sequences") which are involved in production of syringyl lignin to
a gymnosperm lignin promoter region (referred to at times herein as the "GL promoter region") specific
to genes involved in gymnosperm lignin biosynthesis to form a gymnosperm syringyl lignin expression 
cassette (referred to at times herein as the "GSL expression cassette"). Alternatively, the one or more 
ASL DNA sequences are fused to one or more constitutive promoters to form a GSL expression 
cassette. 

The GSL expression cassette preferably also includes selectable marker genes which enable
transformed cells to be differentiated from untransformed cells. The GSL expression cassette 
containing selectable marker genes is inserted into the gymnosperm genome and transformed cells are 
identified and selected, from which whole gymnosperm plants may be produced which exhibit 
production of syringyl lignin. 

To suppress production of less preferred forms of lignin in gymnosperms, such as guaiacyl
lignin, genes from the gymnosperm associated with production of these less preferred forms of lignin
are identified and isolated, and the DNA sequence coding for anti-sense mRNA (referred to at times
herein as the "GL anti-sense sequence") for these genes is produced. The DNA sequence coding for
anti-sense mRNA is then incorporated into the gymnosperm genome; when expressed, the anti-sense
mRNA binds to the less preferred guaiacyl gymnosperm lignin mRNA, inactivating it.

Further features of these and various other steps and procedures associated with practice of the 
invention will now be described in more detail beginning with identification and isolation of ASL DNA 
sequences of interest for use in inducing production of syringyl lignin in a gymnosperm. 
I. Determination of DNA Sequence for Genes Associated with Production of Syringyl Lignin
The general biosynthetic pathway for production of lignin has been postulated as shown in Fig.
1. From Fig. 1, it can be seen that the genes CCL, OMT and F5H (which is from the class of P450
genes) may play key roles in production of syringyl lignin in some plant species, but their specific 
contributions and mechanisms remain to be positively established. It is suspected that the CCL, OMT 
and F5H genes may have specific equivalents in a specific angiosperm, such as sweetgum. 
Accordingly, one aim of the present invention is to identify, sequence and clone specific genes of
interest from an angiosperm such as sweetgum which are involved in production of syringyl lignin and
to then introduce those genes into the genome of a gymnosperm, such as loblolly pine, to induce 
production of syringyl lignin. 

Genes of interest may be identified in various ways, depending on how much information about 
the gene is already known. Genes believed to be associated with production of syringyl lignin have
already been sequenced from a few angiosperm species, viz, CCL and OMT. 

DNA sequences of the various CCL and OMT genes are compared to each other to determine if
there are conserved regions. Once the conserved regions of the DNA sequences are identified, primers
homologous to the conserved sequences are synthesized. Reverse transcription of the DNA-free total
RNA which was purified from sweetgum xylem tissue, followed by double PCR using gene-specific
primers, enables production of probes for the CCL and OMT genes.
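
The comparison step described above can be sketched in code. The following is only a minimal illustration, assuming the sequences have already been aligned to equal length by some external tool; the toy sequences and the function name `conserved_windows` are hypothetical and are not real CCL or OMT data.

```python
def conserved_windows(aligned, width):
    """Return (start, sequence) for every gap-free window of `width` columns
    in which all pre-aligned sequences are identical."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    hits = []
    for start in range(length - width + 1):
        window = aligned[0][start:start + width]
        if "-" in window:
            continue  # skip windows that contain alignment gaps
        if all(seq[start:start + width] == window for seq in aligned[1:]):
            hits.append((start, window))
    return hits


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "ATGGCTGGAACCACCGGTTTA",
    "ATGACTGGAACCACCGGACTA",
    "ATGGCAGGAACCACCGGTCTA",
]
hits = conserved_windows(toy_alignment, 10)
for start, window in hits:
    print(start, window)
```

Windows returned by such a scan would be candidate primer sites; a real design would also tolerate degenerate positions rather than require exact identity.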

A sweetgum cDNA library is constructed in a host, such as Lambda ZAPII, available from
Stratagene, of La Jolla, CA, using poly(A)+ RNA isolated from sweetgum xylem, according to the
methods described by Bugos et al. (1995 Biotechniques 19:734-737). The above mentioned probes are
used to assay the sweetgum cDNA library to locate cDNA which codes for enzymes involved in
production of syringyl lignin. Once a syringyl lignin sequence is located, it is then cloned and
sequenced according to known methods which are familiar to those of ordinary skill.

In accordance with the invention, two sweetgum syringyl lignin genes have been determined 
using the above-described technique. These genes have been designated 4CL and bi-OMT. The
sequence obtained for the sweetgum syringyl lignin gene, designated bi-OMT, is illustrated in Fig. 2
(SEQ ID 5 and 6). The sequence obtained for the sweetgum syringyl lignin gene, designated 4CL, is 
illustrated in Fig. 3 (SEQ ID 7 and 8). 

An alternative procedure was employed to identify the F5H equivalent genes in sweetgum.
Because the DNA sequences for similar P450 genes from other plant species were known, probes for
the P450 genes were designed based on the conserved regions found by comparing the known
sequences for similar P450 genes. The known P450 sequences used for comparison include all plant
P450 genes in the GenBank database. Primers were designed based on two highly conserved regions
which are common to all known plant P450 genes. The primers were then used in a PCR reaction with
the sweetgum cDNA library as a template. Once P450-like fragments were located, they were
amplified using standard PCR techniques, cloned into a pBluescript vector available from Stratagene of
La Jolla, CA and transformed into a DH5α E. coli strain available from Gibco BRL of Gaithersburg,
MD.

After E. coli colonies were tested in order to determine that they contained the P450-like DNA 
fragments, the fragments were sequenced. Several P450-like sequences were located in sweetgum 
using the above described technique. One P450-like sequence was sufficiently different from other
known P450 sequences to indicate that it represented a new P450 gene family. This potentially new
P450 cDNA fragment was used as a probe to screen two full length clones from the sweetgum xylem
cDNA library. These putative hydroxylase clones were designated P450-1 and P450-2. The sequences
obtained for P450-1 and P450-2 are illustrated in Fig. 4 (SEQ ID 1 and 2) and Fig. 5 (SEQ ID 3 and
4).

II. Identification of GL Gene Promoter Region

In order to locate gymnosperm lignin promoter regions, probes are developed to locate lignin 
genes. After the gymnosperm lignin gene is located, the portion of DNA upstream from the gene is 
sequenced, preferably using the GenomeWalker Kit, available from Clontech. The portion of DNA
upstream from the lignin gene will generally contain the gymnosperm lignin promoter region.

Gymnosperm genes of interest include CCL-like genes and PAL-like genes, which are believed
to be involved in the production of lignin in gymnosperms. Preferred probe sequences are developed
based on previously sequenced genes, which are available from GenBank. The preferred GenBank
accession numbers for the CCL-like genes include U39404 and U39405. A preferred GenBank
accession number for a PAL-like gene is U39792. Probes for such genes are constructed according to
methods familiar to those of ordinary skill in the art. A genomic DNA library is constructed and DNA
fragments which code for gymnosperm lignin genes are then identified using the above mentioned
probes. A preferred DNA library is obtained from the gymnosperm, Pinus taeda (L.)[loblolly pine],
and a preferred host of the genomic library is Lambda DASH II, available from Stratagene of La Jolla,
CA.

Once the DNA fragments which code for the gymnosperm lignin genes are located, the
genomic region upstream from the gymnosperm lignin gene (the 5' flanking region) is identified. This
region contains the GL promoter. Three promoter regions were located from gymnosperm lignin
biosynthesis genes. The first is the 5' flanking region of the loblolly pine 4CL1B gene, shown in Fig. 6
(SEQ ID 10). The second is the 5' flanking region of the loblolly pine gene 4CL3B, shown in Fig. 7
(SEQ ID 11). The third is the 5' flanking region of the loblolly pine gene PAL, shown in Fig. 8 (SEQ
ID 9).

III. Fusing the GL Promoter Region to the ASL DNA Sequence

The next step of the process is to fuse the GL promoter region to the ASL DNA sequence to
make a GSL expression cassette for insertion into the genome of a gymnosperm. This may be
accomplished by standard techniques. In a preferred method, the GL promoter region is first cloned
into a suitable vector. Preferred vectors are pGEM7Z, available from Promega, Madison, WI and SK
available from Stratagene, of La Jolla, CA. After the promoter sequence is cloned into the vector, it is
then released with suitable restriction enzymes. The ASL DNA sequence is released with the same
restriction enzyme(s) and purified.

The GL promoter region sequence and the ASL DNA sequence are then ligated such as with T4
DNA ligase, available from Promega, to form the GSL expression cassette. Fusion of the GL and ASL
DNA sequence is confirmed by restriction enzyme digestion and DNA sequencing. After confirmation
of GL promoter-ASL DNA fusion, the GSL expression cassette is released from the original vector with
suitable restriction enzymes and used in construction of vectors for plant transformation.
IV. Fusing the ASL DNA Sequence to a Constitutive Promoter

In an alternative embodiment, a standard constitutive promoter may be fused with the ASL 
DNA sequence to make a GSL expression cassette. For example, a standard constitutive promoter may 
be fused with P450-1 to form an expression cassette for insertion of P450-1 sequences into a
gymnosperm genome. In addition, a standard constitutive promoter may be fused with P450-2 to form 
an expression cassette for insertion of P450-2 into a gymnosperm genome. A constitutive promoter for 
use in the invention is the double 35S promoter. 

In the preferred practice of the invention using constitutive promoters, a suitable vector such as
pBI221 is digested by XbaI and HindIII to release the 35S promoter. At the same time the vector
pHygro, available from International Paper, is digested by XbaI and HindIII to release the double 35S
promoter. The double 35S promoter is ligated to the previously digested pBI221 vector to produce a
new pBI221 with the double 35S promoter. This new pBI221 is digested with SacI and SmaI to
release the GUS fragment. The vector is next treated with T4 DNA polymerase to produce blunt ends
and the vector is self-ligated. This vector is then further digested with BamHI and XbaI, available from
Promega. After the pBI221 vector containing the constitutive promoter region has been prepared,
lignin gene sequences are prepared for insertion into the pBI221 vector.

The coding regions of sweetgum P450-1 or P450-2 are amplified by PCR using primers with
restriction sites incorporated in the 5' and 3' ends. In one example, an XbaI site was incorporated at
the 5' end and a BamHI site was incorporated at the 3' end of the sweetgum P450-1 or P450-2 genes.
After PCR, the P450-1 and P450-2 genes were separately cloned into a TA vector available from
Invitrogen. The TA vectors containing the P450-1 and P450-2 genes, respectively, were digested by
XbaI and BamHI to release the P450-1 or P450-2 sequences.
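
As a rough illustration of how primers carrying these restriction sites might be laid out, the sketch below prepends the XbaI recognition sequence (TCTAGA) to a forward primer and the BamHI recognition sequence (GGATCC) to a reverse primer. The coding sequence, the two-base clamp and the annealing length are hypothetical placeholders chosen for the example; the actual primers used for P450-1 and P450-2 are not given in this section.

```python
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(coding_seq, anneal_len=18, clamp="GC"):
    """Build a forward primer with a 5' XbaI tail and a reverse primer with
    a 5' BamHI tail, so the amplified product carries XbaI at its 5' end
    and BamHI at its 3' end. `clamp` and `anneal_len` are illustrative."""
    forward = clamp + XBAI_SITE + coding_seq[:anneal_len]
    reverse = clamp + BAMHI_SITE + reverse_complement(coding_seq[-anneal_len:])
    return forward, reverse


# Hypothetical toy coding sequence (not the sweetgum P450 sequence).
toy_cds = "ATGGCTTCAGCTACCGATCTGAAGGTTGCATTCAACTGA"
fwd, rev = tailed_primers(toy_cds, anneal_len=12)
print(fwd)
print(rev)
```

In practice one would also confirm that neither site occurs inside the coding region, since an internal site would be cut when the insert is released from the TA vector.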

The p35SS vector, described above, and the isolated sweetgum P450-1 or P450-2 fragments 
were then ligated to make GSL expression cassettes containing the constitutive promoter.
V. Inserting the Expression Cassette into the Gymnosperm Genome

There are a number of methods by which the GSL expression cassette may be inserted into a 
target gymnosperm cell. One method of inserting the expression cassette into the gymnosperm is by 
micro-projectile bombardment of gymnosperm cells. For example, embryogenic tissue cultures of 
loblolly pine may be initiated from immature zygotic embryos. Tissue is maintained in an 
undifferentiated state on semi-solid proliferation medium. For transformation, embryogenic tissue is
suspended in liquid proliferation medium. Cells are then sieved through a screen, preferably 40 mesh,
to separate small, densely cytoplasmic cells from large vacuolar cells.

After separation, a portion of the liquid cell suspension fraction is vacuum deposited onto filter
paper and placed on semi-solid proliferation medium. The prepared gymnosperm target cells are then
grown for several days on filter paper discs in a petri dish.

A 1:1 mixture of plasmid DNA containing the selectable marker expression cassette and 
plasmid DNA containing the P450-1 expression cassette may be precipitated with gold to form 
microprojectiles. The microprojectiles are rinsed in absolute ethanol and aliquots are dried onto a
suitable macrocarrier such as the macrocarrier available from BioRad in Hercules, CA. 

Prior to bombardment, embryogenic tissue is preferably desiccated under a sterile laminar-flow
hood. The desiccated tissue is transferred to semi-solid proliferation medium. The prepared
microprojectiles are accelerated from the macrocarrier into the desiccated target cells using a suitable
apparatus such as a BioRad PDS-1000/He particle gun. In a preferred method, each plate is
bombarded once, rotated 180 degrees, and bombarded a second time. Preferred bombardment
parameters are 1350 psi rupture disc pressure, 6 mm distance from the rupture disc to macrocarrier
(gap distance), 1 cm macrocarrier travel distance, and 10 cm distance from macrocarrier stopping
screen to culture plate (microcarrier travel distance). Tissue is then transferred to semi-solid
proliferation medium containing a selection agent, such as hygromycin B, for two days after
bombardment.

Other methods of inserting the GSL expression cassette include use of silicon carbide whiskers,
transformed protoplasts, Agrobacterium vectors and electroporation.
VI. Identifying Transformed Cells

In general, insertion of the GSL expression cassette will typically be carried out in a mass of 
cells and it will be necessary to determine which cells harbor the recombinant DNA molecule
containing the GSL expression cassette. Transformed cells are first identified by their ability to grow
vigorously on a medium containing an antibiotic which is toxic to non-transformed cells. Preferred
antibiotics are kanamycin and hygromycin B. Cells which grow vigorously on antibiotic containing
medium are further tested for presence of portions of the plasmid vector, for presence of the syringyl
lignin genes in the GSL expression cassette (e.g. the angiosperm bi-OMT, 4CL, P450-1 or P450-2
gene), or for presence of other fragments in the GSL expression cassette. Specific methods which can
be used to test for presence of portions of the GSL expression cassette include Southern blotting with a
labeled complementary probe or PCR amplification with specific complementary primers. In yet
another approach, an expressed syringyl lignin enzyme can be detected by Western blotting with a
specific antibody, or by assaying for a functional property such as the appearance of functional
enzymatic activity.
VII. Production of a Gymnosperm Plant from the Transformed Gymnosperm Cell

Once transformed embryogenic cells of the gymnosperm have been identified, isolated and
multiplied, they may be grown into plants. It is expected that all plants resulting from transformed cells
will contain the GSL expression cassette in all their cells, and that wood in the secondary growth stage
of the mature plant will be characterized by the presence of syringyl lignin.

Transgenic embryogenic cells are allowed to replicate and develop into somatic embryos,
which are then converted into somatic seedlings.

VIII. Identification, Production and Insertion of a GL mRNA Anti-sense Sequence

In addition to adding ASL DNA sequences, anti-sense sequences may be incorporated into a 
gymnosperm genome, via GSL expression cassettes, in order to suppress formation of the less preferred 
native gymnosperm lignin. To this end, the gymnosperm lignin gene is first located and sequenced in 
order to determine its nucleotide sequence. Methods for locating and sequencing amino acids which 
have been previously discussed may be employed. For example, if the gymnosperm lignin gene has 
already been purified, standard sequencing methods may be employed to determine the DNA nucleic 
acid sequence. 

If the gymnosperm lignin gene has not been purified and functionally similar DNA or mRNA
sequences from similar species are known, those sequences may be compared to identify highly 
conserved regions and this information used as a basis for the construction of a probe. A gymnosperm 
cDNA or genomic library can be probed with the above mentioned sequences to locate the gymnosperm 
lignin cDNA or genomic DNA. Once the gymnosperm lignin DNA is located, it may be sequenced 
using standard sequencing methods. 

After the DNA sequence has been obtained for a gymnosperm lignin sequence, the
complementary anti-sense strand is constructed and incorporated into an expression cassette. For
example, the GL mRNA anti-sense sequence may be fused to a promoter region to form an expression
cassette as described above. In a preferred method, the GL mRNA anti-sense sequence is incorporated
into the previously discussed GSL expression cassette which is inserted into the gymnosperm genome as
described above.
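
The relationship between a sense sequence and its anti-sense counterpart can be illustrated with a short sketch. It assumes the usual convention that the anti-sense insert is the reverse complement of the sense coding strand, so that its transcript base-pairs with the native mRNA; the toy sequence and all function names are hypothetical and do not represent a real gymnosperm lignin gene.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_insert(sense_dna):
    """Reverse complement of the sense coding strand."""
    return sense_dna.translate(_DNA_COMPLEMENT)[::-1]


def transcribe(coding_strand):
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.replace("T", "U")


def pairs_fully(rna_a, rna_b):
    """True if the two RNAs can form a perfect antiparallel duplex."""
    return len(rna_a) == len(rna_b) and all(
        _RNA_PAIR[a] == b for a, b in zip(rna_a, reversed(rna_b))
    )


# Hypothetical toy sense sequence, for illustration only.
toy_sense = "ATGGCCAAGTTC"
antisense_dna = antisense_insert(toy_sense)
native_mrna = transcribe(toy_sense)
antisense_rna = transcribe(antisense_dna)
print(antisense_dna, pairs_fully(native_mrna, antisense_rna))
```

The final check simply confirms the property the text relies on: the anti-sense transcript is complementary to the native mRNA along its whole length.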

IX. Inclusion of Cytochrome P450 Reductase (CPR) to Enhance Biosynthesis of Syringyl Lignin in Gymnosperms

In the absence of external cofactors such as NADPH (an electron donor in reductive
biosyntheses), certain angiosperm lignin genes such as the P450 genes may remain inactive or not
achieve full or desired activity after insertion into the genome of a gymnosperm. Inactivity or
insufficient activity can be determined by testing the resulting plant which contains the P450 genes for 
the presence of syringyl lignin in secondary growth. It is known that cytochrome P450 reductase (CPR) 
may be involved in promoting certain reductive biochemical reactions, and may activate the desired
expression of genes in many plants. Accordingly, if it is desired to enhance the expression of the
angiosperm syringyl lignin genes in the gymnosperm, CPR may be inserted in the gymnosperm
genome. In order to express CPR, the DNA sequence of the enzyme is ligated to a constitutive
promoter or, for a specific species such as loblolly pine, xylem-specific lignin promoters such as PAL,
4CL1B or 4CL3B to form an expression cassette. The expression cassette may then be inserted into the
gymnosperm genome by various methods as described above. 
X. Examples 

The following non-limiting examples illustrate further aspects of the invention. In these 
examples, the angiosperm is Liquidambar styraciflua (L.)[sweetgum] and the gymnosperm is Pinus
taeda (L.)[loblolly pine]. The nomenclature for the genes referred to in the examples is as follows:

Gene                     Biochemical Name
4CL (angiosperm)         4-coumarate CoA ligase
bi-OMT (angiosperm)      bifunctional-O-methyl transferase
P450-1 (angiosperm)      cytochrome P450
P450-2 (angiosperm)      cytochrome P450
PAL (gymnosperm)         phenylalanine ammonia-lyase
4CL1B (gymnosperm)       4-coumarate CoA ligase
4CL3B (gymnosperm)       4-coumarate CoA ligase


Example 1 - Isolating and Sequencing bi-OMT and 4CL Genes from an Angiosperm 



A cDNA library for sweetgum was constructed in Lambda ZAPII, available from Stratagene,
of La Jolla, CA, using poly(A)+ RNA isolated from sweetgum xylem tissue. Probes for bi-OMT and
4CL were obtained through reverse transcription of their mRNAs followed by double PCR using
gene-specific primers which were designed based on the OMT and CCL cDNA sequences obtained
from similar genes cloned from other species.

Four primers were used for amplifying OMT fragments. One was an oligo-dT primer. One
was a bi-OMT primer (which was used to clone gene fragments through modified differential display
technique, as described below in Example 2) and the other two were degenerate primers, which were
based on the conserved sequences of all known OMTs. The two degenerate primers were derived
from the following amino acid sequences:

5'-Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala Ala Gly Gly Cys-3'
(primer #22) and

3'-Ala Ala Ala Gly Ala Gly Ala Gly Asn Ala Cys Asn Asn Ala Asn Asn Ala Asn Gly Ala-5'
(primer #23).

A 900 bp PCR product was produced when oligo-dT primer and primer #22 were used, and a 
550 bp fragment was produced when primer numbers 22 and 23 were used. 

Three primers were used for amplifying CCL fragments. They were derived from the 
following amino acid sequences: 

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Gly Ile Ala Cys Ile Ala Cys Ile Gly Gly Ile Tyr Thr
Ile Cys Cys Ile Ala Ala Arg Gly Gly-3' (primer R1S)

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Thr Ile Gly Thr Ile Gly Cys Ile Cys Ala Arg Cys
Ala Arg Gly Thr Ile Gly Ala Tyr Gly Gly-3' (primer H1S) and

3'-Cys Cys Ile Cys Thr Tyr Thr Ala Asp Ala Cys Arg Thr Ala Asp Gly Cys Ile Cys Cys Ala
Gly Cys Thr Gly Thr Ala-5' (primer R2A)

R1S and H1S were both sense primers. Primer R2A was an anti-sense primer. A 650 bp
fragment was produced when R1S and R2A primers were used and a 550 bp fragment was produced when
primers H1S and R2A were used. The sequences of these three primers were derived from conserved
sequences for plant CCLs.
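
One possible reading of the primer listings above, offered here only as an assumption, is that each three-letter token stands for the single-letter code of the same amino acid and that those single letters are in fact IUPAC nucleotide symbols (for example Gly for G, Thr for T, Tyr for Y meaning C or T, Met for M meaning A or C). Under that assumption the listings can be converted to ordinary degenerate primer notation and their degeneracy counted. The mapping table and function names below are illustrative and not part of the original disclosure.

```python
# Assumed mapping: three-letter amino acid code -> one-letter code, which is
# then read as an IUPAC nucleotide symbol. "I" is assumed to denote inosine.
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Gly": "G", "Thr": "T",
    "Met": "M", "Arg": "R", "Tyr": "Y", "Asp": "D",
    "Asn": "N", "Ile": "I",
}

# Number of bases each symbol can stand for (inosine counted as one
# synthesized base).
BASE_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "M": 2, "R": 2, "Y": 2, "D": 3, "N": 4,
}


def decode(tokens):
    """Convert space-separated three-letter tokens into a one-letter string."""
    return "".join(THREE_TO_ONE[token] for token in tokens.split())


def degeneracy(primer):
    """Count the distinct oligonucleotides covered by a degenerate primer."""
    count = 1
    for base in primer:
        count *= BASE_CHOICES[base]
    return count


primer_22 = decode(
    "Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr "
    "Ala Ala Cys Ala Ala Gly Gly Cys"
)
print(primer_22, degeneracy(primer_22))
```

If this reading is right, primer #22 is a 20-mer with three two-fold degenerate positions. The same functions can be applied to the other listings, keeping in mind that the interpretation itself is unconfirmed.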

The reverse transcription-double PCR cloning technique used for these examples consisted of
adding 10 µg of DNA-free total RNA in 25 µl DEPC-treated water to a microfuge tube. Next, the
following solutions were added:

a. 5x Reverse transcript buffer 8.0 µl

b. 0.1 M DTT 4.0 µl

c. 10 mM dNTP 2.0 µl

d. 100 µM oligo-dT primers 8.0 µl

e. RNasin 2.0 µl

f. Superscript II 1.0 µl
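
As a quick arithmetic check on the recipe above, the sketch below sums the listed volumes and derives the final concentration of each component by simple dilution. It assumes the volumes are in microliters as read from the list and that nothing else is added to the tube; the variable names are illustrative.

```python
# Volumes in microliters, as read from the list above (assumed complete).
template_ul = 25.0  # DNA-free total RNA in DEPC-treated water

# name: (stock concentration, unit, volume in microliters)
reagents = {
    "reverse transcription buffer": (5.0, "x", 8.0),
    "DTT": (100.0, "mM", 4.0),          # 0.1 M stock expressed in mM
    "dNTP": (10.0, "mM", 2.0),
    "oligo-dT primer": (100.0, "uM", 8.0),
}
enzymes_ul = {"RNasin": 2.0, "Superscript II": 1.0}

total_ul = (
    template_ul
    + sum(volume for _, _, volume in reagents.values())
    + sum(enzymes_ul.values())
)

# Final concentration = stock concentration * (reagent volume / total volume).
final = {
    name: stock * volume / total_ul
    for name, (stock, _, volume) in reagents.items()
}

print(f"total volume: {total_ul} ul")
for name, (_, unit, _) in reagents.items():
    print(f"{name}: {final[name]:g} {unit}")
```

On these figures the reaction comes to 50 µl, which gives a convenient sanity check if any of the volumes are changed.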

After mixing, the tube was incubated at a temperature of 42° C for one (1) hour, followed by
incubation at 70° C for fifteen (15) minutes. Forty (40) µl of 1N NaOH was added and the tube was
further incubated at 68° C for twenty (20) minutes. After the incubation periods, 80 µl of 1N HCl was
added to the reaction mixture. At the same time, 17 µl NaOAc, 5 µl glycogen and 768 µl of 100%
ethanol were added and the reaction mixture was maintained at -80° C for 15 minutes in order to
precipitate the cDNA. The precipitated cDNA was centrifuged at high speed at 4° C for 15 minutes.
The resulting pellet was washed with 70% ethanol and then dried at room temperature, and then was
dissolved in 20 µl of water.

The foregoing procedure produced purified cDNA which was used as a template to carry out
first round PCR using primers #22 and oligo-dT for cloning OMT cDNA and primers R1S and R2A for
cloning 4CL cDNA. For the first round PCR, a master mix of 50 µl for each reaction was prepared.

Each 50 µl mixture contained:

a. 10x buffer 5 µl

b. 25 mM MgCl2 5 µl

c. 100 µM sense primer 1 µl (primer #22 for OMT and primer R1S for CCL)

d. 100 µM anti-sense primer 1 µl (oligo-dT primer for OMT and R2A for CCL)

f. Taq DNA polymerase 0.5 µl

Of this master mix, 48 µl was added into a PCR tube containing 2 µl of cDNA for PCR. The
tube was heated to 95° C for 45 seconds, 52° C for one minute and 72° C for two minutes. This
temperature cycle was repeated for 40 cycles and the mixture was then held at 72° C for 10 minutes.

The cDNA fragments obtained from the first round of PCR were used as templates to perform
the second round of PCR using primers 22 and 23 for cloning bi-OMT cDNA and primers H1S and R2A
for cloning 4CL cDNA. The second round PCR conditions were the same as the first round.

The desired cDNA fragment was then sub-cloned and sequenced. After the second round of
PCR, the product with the predicted size was excised from the gel and ligated into a pUC19 vector
available from Clontech, of Palo Alto, CA, and then transformed into DH5α, an E. coli strain
available from Gibco BRL, of Gaithersburg, MD. After the inserts had been checked for correct size,
the colonies were isolated and plasmids were sequenced using a Sequenase kit, available from USB, of
Cleveland, OH. The sequences are shown in Fig. 2 (SEQ ID 5 and 6) and Fig. 3 (SEQ ID 7 and 8).

Example 2 - Alternative Isolation Method of Angiosperm bi-OMT Gene

As previously mentioned, one bi-OMT clone was produced via modified differential display
technique. This method is another type of reverse transcription-PCR, in which DNA-free total RNA
was reverse transcribed using oligo-dT primers with a single-base anchor to form cDNA. The
oligo-dT primers used for reverse transcription of mRNA to synthesize cDNA were:

T11A: TTTTTTTTTTTA,

T11C: TTTTTTTTTTTC, and

T11G: TTTTTTTTTTTG,
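The three anchored primers follow one pattern: eleven T residues followed by a single anchor base (A, C or G). A minimal Python sketch of that pattern is given here; the function name `anchored_oligo_dt` is hypothetical and used only for illustration.

```python
def anchored_oligo_dt(n_t=11, anchors="ACG"):
    """Return {name: sequence} for oligo-dT primers carrying a one-base 3' anchor."""
    return {f"T{n_t}{base}": "T" * n_t + base for base in anchors}


primers = anchored_oligo_dt()
for name, seq in primers.items():
    # Each primer is n_t thymines plus one anchor base, so 12 nt here.
    print(f"{name}: {seq} ({len(seq)} nt)")
```

T is omitted from the anchor set because a T anchor would simply extend the oligo-dT run rather than fix the primer at the start of the poly(A) tail.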

These cDNAs were then used as templates for radioactive PCR which was conducted in the
presence of the same oligo-dT primers as listed above, a bi-OMT gene-specific primer and 35S-dATP.
The OMT gene-specific primer was derived from the following amino acid sequence: 5'-Cys Cys Asn
Gly Gly Asn Gly Gly Ser Ala Arg Gly Ala-3'.
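Read letter by letter, the three-letter names in the gene-specific primer collapse to one-letter symbols. The sketch here assumes (this is an interpretation, not something the text states) that those one-letter symbols are IUPAC nucleotide codes, so that Asn stands for N (any base), Ser for S (G or C) and Arg for R (A or G). Under that assumption it decodes the primer and counts how many distinct sequences the degenerate primer covers. The names `THREE_TO_ONE`, `IUPAC_CHOICES`, `decode` and `degeneracy` are illustrative only.

```python
# Three-letter residue names -> one-letter symbols (assumed to be IUPAC nucleotide codes).
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Gly": "G", "Thr": "T",
    "Asn": "N", "Arg": "R", "Ser": "S", "Tyr": "Y", "Asp": "D",
}

# Number of bases each IUPAC symbol can stand for.
IUPAC_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "S": 2, "D": 3, "N": 4}


def decode(primer):
    """Collapse a space-separated three-letter string into one-letter symbols."""
    return "".join(THREE_TO_ONE[word] for word in primer.split())


def degeneracy(seq):
    """Count the distinct sequences represented by a degenerate primer."""
    total = 1
    for symbol in seq:
        total *= IUPAC_CHOICES[symbol]
    return total


omt_primer = decode("Cys Cys Asn Gly Gly Asn Gly Gly Ser Ala Arg Gly Ala")
print(omt_primer, len(omt_primer), degeneracy(omt_primer))
```

With two N positions, one S and one R, the decoded 13-mer would represent 4 x 4 x 2 x 2 = 64 sequences under this reading.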

The following PCR reaction solutions were combined in a microfuge tube: 

a. H2O 9.2 µl,

b. Taq Buffer 2.0 µl

c. dNTP (25 µM) 1.6 µl

d. Primers (5 µM) 2 µl for each primer

e. 35S-dATP 1 µl

f. Taq pol. 0.2 µl

g. cDNA 2.0 µl.
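As a quick check on the recipe, the listed volumes can be summed. The sketch assumes that every volume is in microlitres, that the 35S-dATP volume is 1 µl, and that each reaction contains two primers (one anchored oligo-dT primer and the gene-specific primer); the helper `scale_mix` is hypothetical.

```python
# Per-reaction volumes in microlitres, as read from the component list.
volumes_ul = {
    "H2O": 9.2,
    "Taq buffer": 2.0,
    "dNTP": 1.6,
    "oligo-dT primer": 2.0,       # assumption: one anchored oligo-dT primer per reaction
    "gene-specific primer": 2.0,  # assumption: paired with the OMT gene-specific primer
    "35S-dATP": 1.0,              # assumption: volume read as 1 microlitre
    "Taq polymerase": 0.2,
    "cDNA": 2.0,
}


def scale_mix(volumes, n_reactions, exclude=("cDNA",)):
    """Scale every component except the template for a shared master mix."""
    return {name: round(vol * n_reactions, 2)
            for name, vol in volumes.items() if name not in exclude}


total_ul = round(sum(volumes_ul.values()), 1)
print(f"Reaction volume: {total_ul} ul")
print(scale_mix(volumes_ul, n_reactions=8))
```

Under these assumptions the components add up to a 20 µl reaction.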

The tube was heated to a temperature of 94° C and held for 45 seconds, then at 37° C for 2 
minutes and then 72 °C for 45 seconds for forty cycles, followed by a final reaction at 72 °C for 5 
minutes. 
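The programmed hold time of this cycling profile can be worked out directly. The sketch counts only the stated hold times; ramp times between temperatures are instrument-dependent and are not included, and the step labels (denaturation, annealing, extension) are conventional names rather than terms taken from the text.

```python
# (label, temperature in deg C, hold time in seconds) for one cycle
cycle_steps = [
    ("denaturation", 94, 45),
    ("annealing", 37, 120),
    ("extension", 72, 45),
]
n_cycles = 40
final_extension_s = 5 * 60  # final hold at 72 deg C

per_cycle_s = sum(hold for _, _, hold in cycle_steps)
total_s = n_cycles * per_cycle_s + final_extension_s

print(f"Per cycle: {per_cycle_s} s")
print(f"Total programmed hold time: {total_s / 60:.0f} min")
```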

The amplified products were fractionated on a denaturing polyacrylamide sequencing gel and
autoradiography was used to identify and excise the fragments with a predicted size. The designed
OMT gene-specific primer had a sequence conserved in a region toward the 3'-end of the OMT cDNA
sequence. This primer, together with oligo-dT, amplified an OMT cDNA fragment of about
300 bp.

Three oligo-dT primers with a single-base anchor of A, C or G, respectively, were used to pair with the
OMT gene-specific primer. Eight potential OMT cDNA fragments with predicted sizes of about 300 bp
were excised from the gels after several independent PCR rounds using different combinations of
oligo-dT and OMT gene-specific oligonucleotides as primers.

The OMT cDNA fragments were then re-amplified. A Southern blot analysis was performed
for the resulting cDNAs using a 360 base-pair, 32P radio-isotope labeled, aspen OMT cDNA 3'-end
fragment as a probe to identify the cDNA fragments having a strong hybridization signal, under low
stringency conditions. Eight fragments were identified. Out of these eight cDNA fragments, three
were selected based on their high hybridization signal for sub-cloning and sequencing. One clone,
LsOMT3'-1 (where the "Ls" prefix indicates that the clone was derived from the Liquidambar
styraciflua (L.) genome), was confirmed to encode bi-OMT based on its high homology to other
lignin-specific plant OMTs at both nucleotide and amino acid sequence levels.

A cDNA library was constructed in Lambda ZAP II, available from Stratagene, of LaJolla,
CA, using 5 µg poly(A)+ RNA isolated from sweetgum xylem tissue. The primary library consisting of
approximately 0.7 x 10* independent recombinants was amplified and approximately 10^5
plaque-forming-units (pfu) were screened using a homologous 550 base-pair probe. The hybridized
filter was washed under high stringency conditions (0.25 x SSC, 0.1% SDS, 65° C). The colony
containing the bi-OMT fragment identified by the probe was eluted and the bi-OMT fragment was
produced. The sequence as illustrated in Fig. 2 (SEQ ID 5 and 6) was obtained.
Example 3 - Isolating and Producing the DNA which codes for the Angiosperm P450-1 Gene

In order to find putative P450 cDNA fragments as probes for cDNA library screening, a highly
degenerate sense primer based on the amino acid sequence of 5'-Glu, Glu, Phe, Arg, Pro, Glu, Arg-3'
was designed based on the conserved regions found in some plant P450 proteins. This conserved
domain was located upstream of another highly conserved region in P450 proteins, which had an amino
acid sequence of 5'-Phe Gly Xaa Gly Xaa Xaa Cys Xaa Gly-3'. This primer was synthesized with the
incorporation of an Xbol restriction site to give a 26-base-pair oligomer.

This primer and the oligo-dT-XhoI primer were then used to perform PCR reactions with the 
sweetgum cDNA library as a template. The cDNA library was constructed in Lambda ZAPII, available 
from Stratagene, of LaJolla, CA, using poly(A)+ RNA isolated from sweetgum xylem tissue.
Amplified fragments of 300 to 600 bp were obtained. Because the designed primer was located 
upstream of the highly conserved P450 domain, this design distinguished whether the PCR products 
were P450 gene fragments depending on whether they contained the highly conserved amino acid 
domain. 

All the fragments obtained from the PCR reaction were then cloned into a pUC19 vector,
available from New England Biolabs, of Beverly, MA, and transformed into a DH5α E. coli strain,
available from Gibco BRL, of Gaithersburg, MD.

Twenty-four positive colonies were obtained and sequenced. Sequence analysis indicated four 
groupings within the twenty-four colonies. One was C4H, one was an unknown P450 gene, and two 
did not belong to P450 genes. Homologies of P450 genes in different species are usually more than 
80%. Because the homologies between the P450 gene families found here were around 40%, the 
sequence analysis indicated that a new P450 gene family was sequenced. Moreover, since this P450 
cDNA was isolated from xylem tissue, it was highly probable that this P450 gene was P450-1. 

The novel sweetgum P450 cDNA fragment was used as a probe to screen for a full-length cDNA
encoding P450-1. Once the P450-1 gene was located it was sequenced. The length of the P450-1
cDNA is 1707 bp and it contains 45 bp of 5' non-coding region and 135 bp of 3' non-coding region.
The deduced amino acid sequence also indicates that this P450 cDNA has a hydrophobic core at the
N-terminal, which could be regarded as a leader sequence for co-translational targeting to membranes
during protein synthesis. At the C-terminal region, there is a heme binding domain that is characteristic
of all P450 genes. The P450-1 sequence, as illustrated in Fig. 4 (SEQ ID 1 and 2), was produced
according to the above described methods.
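The stated lengths allow a simple consistency check on the P450-1 reading frame. The sketch subtracts the two non-coding regions from the cDNA length and confirms the remainder is a whole number of codons. It assumes the stop codon is counted as part of the coding span rather than the 3' non-coding region; the text does not say which convention was used.

```python
cdna_bp = 1707   # full-length P450-1 cDNA
utr5_bp = 45     # 5' non-coding region
utr3_bp = 135    # 3' non-coding region

coding_bp = cdna_bp - utr5_bp - utr3_bp
codons, remainder = divmod(coding_bp, 3)

# Assumption: the final codon of the coding span is the stop codon,
# so the deduced protein is one residue shorter than the codon count.
residues = codons - 1

print(f"coding span: {coding_bp} bp = {codons} codons (remainder {remainder})")
print(f"deduced protein length under the stated assumption: {residues} residues")
```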

Example 4 - Isolating and Producing the DNA which codes for the Angiosperm P450-2 Gene

By using a similar strategy of synthesizing PCR primers from the published literature for
hydroxylase genes in plants, another full length P450 cDNA has been isolated that shows significant
similarity with a putative F5H clone from Arabidopsis (Meyers et al. 1996: PNAS 93, 6869-6874).
This cloned cDNA, designated P450-2, contains 1883 bp and encodes an open reading frame of 511
amino acids. The amino acid similarity shared between Arabidopsis F5H and the P450-2 sweetgum
clone is about 75%.

To confirm the function of the FA5H-2 gene, it was expressed in E. coli strain DH5 alpha via
pQE vector preparation, according to directions available with the kit. A CO-Fe2+ binding assay was
also performed to confirm the expression of P450-2 as a functional P450 gene (Omura & Sato 1964,
Journal of Biological Chemistry 239: 2370-2378; Babriac et al. 1991, Archives of Biochemistry and Biophysics
288: 302-309). The CO-Fe2+ binding assay showed a peak at 450 nm, which indicates that P450-2 has
been overexpressed as a functional P450 gene.

The P450-2 protein was further purified for production of antibodies in rabbits, and antibodies
have been successfully produced. In addition, Western blots show that this antibody is specific to the
membrane fraction of sweetgum and aspen xylem extract. When the P450-2 antibody was added to a
reaction mixture containing aspen xylem tissue, enzyme inhibition studies showed that the activity of
FA5H in aspen was reduced by more than 60%, a further indication that P450-2 performs a P450-like
function. Recombinant P450-2 protein co-expressed with Arabidopsis CPR protein in a baculovirus
expression system hydroxylated ferulic acid (specific activity: 7.3 pKat/mg protein), cinnamic acid
(specific activity: 25 pKat/mg protein), and p-coumaric acid (specific activity: 3.8 pKat/mg protein).
The P450-2 enzyme, which may be referred to as C4C3F5-H, appears to be a broad spectrum
hydroxylase in the phenylpropanoid pathway in plants. Fig. 5 (SEQ ID 3 and 4) illustrates the P450-2
sequence.

Example 5 - Identifying Gymnosperm Promoter Regions

In order to identify gymnosperm promoter regions, sequences from loblolly pine PAL and
4CL1B and 4CL3B lignin genes were used as primers to screen the loblolly pine genomic library, using
the GenomeWalker Kit. The loblolly pine PAL primer sequence was obtained from GenBank,
reference number U39792. The loblolly pine 4CL1B primer sequences were also obtained from
GenBank, reference numbers U39404 and U39405.

The loblolly pine genomic library was constructed in Lambda DashII, available from
Stratagene, of LaJolla, CA. 3 x 10^6 phage plaques from the genomic library of loblolly pine were
screened using both the above mentioned PAL cDNA and 4CL (PCR clone) fragments as probes. Five
4CL clones were obtained after screening. Lambda DNAs of two of the five 4CL clones obtained
after screening were isolated and digested by EcoRV, PstI, SalI and XbaI for Southern analysis.
Southern analysis using 4CL fragments as probes indicated that both clones for the 4CL gene were
identical. Results from further mapping showed that none of the original five 4CL clones contained
promoter regions. When tested, the PAL clones obtained from the screening also did not contain
promoter regions.

In a second attempt to clone the promoter regions associated with the PAL and 4CL genes, a Universal
GenomeWalker(TM) kit, available from Clontech, was used. In the process, total DNA from loblolly
pine was digested by several restriction enzymes and ligated into the adaptors (libraries) provided with
the kit. Two gene-specific primers for each gene were designed (GSP1 and 2). After two rounds of
PCR using these primers and adapter primers of the kit, several fragments were amplified from each
library. A [...] fragment and a [...] kb fragment for the PAL gene, and a [...] fragment (4CL1B) and a
0.7 kb fragment (4CL3B) for the 4CL gene, were cloned [...] promoter
region for all three genes. See Fig. 6 (SEQ ID 9), Fig. 7 (SEQ ID 10) and Fig. 8 (SEQ ID 11).

As a first step, an ASL DNA sequence, P450-1, was fused with [...]
a P450-1 expression [...]
were initiated from immature zygotic embryos. The tissue was maintained in an undifferentiated state
on semi-solid proliferation medium, according to methods [...]

[...] screen separation [...] suspension [...] which [...] a 40 mesh
screen was vacuum deposited onto filter paper and placed on semi-solid proliferation medium. The
prepared gymnosperm target cells were then grown for 2 days on filter paper discs [...]

[...] containing the P450-1 expression cassette and an expression cassette containing a selectable marker
[...] of selectable marker expression cassette and plasmid DNA containing the P450-1 expression [...]
precipitated [...]
microprojectiles were rinsed in absolute ethanol and aliquots of 10 µl (5 µg DNA/3 mg [...])
[...] onto a macrocarrier, such as those available from BioRad (Hercules, [...]

Prior to bombardment, embryogenic tissue was desiccated under a sterile laminar-flow hood
for 5 minutes. The desiccated tissue was transferred to semi-solid proliferation medium. The
microprojectiles were accelerated into desiccated target cells using a BioRad PDS-1000 [...]

Each plate was bombarded once, rotated 180 degrees, and bombarded a second time. Preferred
bombardment parameters were 1350 psi rupture disc pressure, 6 mm distance from the [...]
stopping screen to culture plate (microcarrier travel distance). Tissue was then [...]
medium containing hygromycin B [...]
Example 7 - Selecting Transformed Target Cells

After insertion of the P450-2 expression cassette and the selectable marker [...]
exposure to an antibiotic that causes mortality of any cells not containing the [...] expression [...]
Forty independent cell lines were established from cultures co-bombarded with [...]
containing a hygromycin resistance gene construct and the P450-1 construct. [...]
lines V2, Y17, Y7 and 04, as discussed in more detail below.

PCR techniques were then used to verify that the P450-1 gene [...]

LsP450-iml-S primer: ATGGCTTTCCTTCTAATACCCATCTC and
LsP450-iml-A primer: GGGTGTAATGGACGAGCAAGGACTTG
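Basic properties of the two verification primers can be tabulated directly from the sequences as printed. The sketch reports length and GC content and derives the reverse complement of the second primer, which is the strand it would read on if it is indeed the anti-sense partner (an assumption based on the "-A" suffix). The helper names `gc_count` and `reverse_complement` are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G and C bases in a sequence."""
    return seq.count("G") + seq.count("C")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


primer_s = "ATGGCTTTCCTTCTAATACCCATCTC"
primer_a = "GGGTGTAATGGACGAGCAAGGACTTG"

for name, seq in (("S primer", primer_s), ("A primer", primer_a)):
    gc = gc_count(seq)
    print(f"{name}: {len(seq)} nt, GC {gc}/{len(seq)} = {100 * gc / len(seq):.1f}%")

# Sequence the A primer would match on the sense strand (assumed orientation).
print("A primer reverse complement:", reverse_complement(primer_a))
```

The S primer begins with ATG, which is consistent with it sitting at the start codon of the coding sequence, although the text does not state this.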

[...]

Lane 5 contained a DNA size marker PhiX174/HaeIII [...]
indicate molecular sizes of 1353, 1078, 872 and 603 bp.

Lanes 6-9 contained PCR amplification products of hygromycin B [...]
referenced to as PT52. Lane 7 contained transgenic line Y7. Lane 8 contained transgenic line 04.
Lane 9 contained the plasmid which includes the expression cassette containing the gene encoding the
enzyme which confers resistance to the antibiotic hygromycin B. Lanes 7-9 all show an amplified
fragment of about 1000 bp, indicating that the hygromycin gene has been successfully inserted into
transgenic lines Y7 and 04.

These PCR results confirmed the presence of P450-1 and hygromycin resistance gene in 
transformed loblolly pine cell cultures. The results obtained from the PCR verification of 4 cell lines, 
and similar tests with the remaining 36 cell lines, confirm stable integration of the P450-1 gene and 
the hygromycin B gene in 25% of the 40 cell lines. 

In addition, loblolly pine embryogenic cells which have been co-bombarded with the P450-2 
and hygromycin B expression cassettes, are growing vigorously on hygromycin selection medium, 
indicating that the P450-2 expression cassette was successfully integrated into the gymnosperm genome. 

Although various embodiments and features of the invention have been described in the 
foregoing detailed description, those of ordinary skill will recognize the invention is capable of 
numerous modifications, rearrangements and substitutions without departing from the scope of the 
invention as set forth in the appended claims. For example, in the case where the lignin DNA sequence 
is transcribed and translated to produce a functional syringyl lignin gene, those of ordinary skill will 
recognize that because of codon degeneracy a number of polynucleotide sequences will encode the same 
gene. These variants are intended to be covered by the DNA sequences disclosed and claimed herein. 
In addition, the sequences claimed herein include those sequences which encode a gene having substantial
functional identity with those claimed. Thus, in the case of syringyl lignin genes, for example, the 
DNA sequences include variant polynucleotide sequences encoding polypeptides which have substantial 
identity with the amino acid sequence of syringyl lignin and which show syringyl lignin activity in 
gymnosperms. 
What is claimed is:

1. A method for modifying the genome of a gymnosperm which comprises cloning one or
more angiosperm DNA sequences which code for genes necessary for production of angiosperm
syringyl lignin monomer units, fusing one or more of the angiosperm DNA sequences to a
promoter region associated with a gene to form an expression cassette and inserting the
expression cassette into the gymnosperm genome to thereby produce a modified genome in the
gymnosperm containing genes which code for enzymes which produce syringyl lignin monomer
units.

2. The method of claim 1, further comprising incorporating a genetic sequence which 
codes for anti-sense mRNA into the gymnosperm genome in order to suppress formation of 
guaiacyl lignin monomer units. 

3. A gymnosperm plant containing an expression cassette produced according to the
method of claim 1.

4. A loblolly pine containing an expression cassette produced according to the method of
claim 1.

5. The method of claim 1 wherein the angiosperm DNA sequences are selected from the
class consisting of 4-coumarate CoA ligase (4CL), bifunctional-O-methyl transferase (bi-OMT)
and P450-1 and P450-2.

6. The method of claim 1 wherein the promoter region is selected from the class 
consisting of the 5' flanking region of phenylalanine ammonia-lyase (PAL) and the 5' flanking 
region of 4-coumarate CoA ligase (4CL1B and 4CL3B). 

7. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome by way of the transformation vector Agrobacterium. 

8. The method of claim 7 wherein the Agrobacterium is Agrobacterium tumefaciens 
EH101. 

9. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome via direct DNA delivery to a target cell. 

10. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome by micro-projectile bombardment of a gymnosperm cell. 

11. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome by electroporation of a gymnosperm cell. 

12. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome via silicon carbide whiskers.

13. The method of claim 1 wherein the expression cassette is inserted into the 
gymnosperm genome via transformed protoplast. 

14. The method of claim 1 further comprising inserting a selectable marker into the
expression cassette.

15. The method of claim 14 wherein the selectable marker is selected from the group 
consisting of kanamycin and hygromycin B. 

16. The method of claim 2 wherein the anti-sense mRNA is a gymnosperm genetic
sequence which codes for the 4-coumarate CoA ligase (4CL) gene.

17. The method of claim 1 wherein the promoter region is a DNA sequence which 
includes the 5' flanking region of the gymnosperm loblolly pine PAL gene. 

18. The method of claim 1 wherein the promoter region is a DNA sequence which
includes the 5' flanking region of the gymnosperm loblolly pine 4CL1B gene.

19. The method of claim 1 wherein the promoter region is a DNA sequence which
includes the 5' flanking region of the gymnosperm loblolly pine 4CL3B gene.

20. The method of claim 1 wherein the promoter region includes a constitutive promoter.

21. An isolated P450-1 DNA sequence which encodes an enzyme involved in the
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No. 1
and 2.

22. An isolated P450-2 DNA sequence which encodes an enzyme involved in the
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No. 3
and 4.

23. An isolated bi-OMT DNA sequence which encodes an enzyme involved in the
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No. 5
and 6.

24. An isolated 4CL DNA sequence which encodes an enzyme involved in the 
biosynthesis of syringyl lignin monomer units, wherein said DNA is as shown in SEQ ID No 7 
and 8. 

25. An isolated DNA, wherein said DNA encodes for an enzyme involved in the 
biosynthesis of one or more syringyl lignin monomer units. 

26. An isolated DNA sequence which includes the 5' flanking region of the gymnosperm
loblolly pine PAL gene, containing the lignin promoter region and regulatory elements for
gymnosperm lignin biosynthesis as shown in SEQ ID No. 9.

27. An isolated DNA sequence which includes the 5' flanking region of the gymnosperm 
loblolly pine 4CL1B, containing the lignin promoter region and regulatory elements for 
gymnosperm lignin biosynthesis as shown in SEQ ID No. 10. 

28. An isolated DNA sequence which includes the 5' flanking region of gymnosperm
loblolly pine 4CL3B, containing the lignin promoter region and regulatory elements for
gymnosperm lignin biosynthesis as shown in SEQ ID No. 11.

29. An isolated DNA, wherein said DNA includes the promoter region of a gymnosperm 
gene involved in lignin biosynthesis. 

30. A method for modifying the genome of loblolly pine which comprises cloning one or 
more angiosperm DNA sequences which code for enzymes necessary for production of syringyl 
lignin monomer units, fusing one or more of the angiosperm DNA sequences to a promoter 
region to form an expression cassette, and inserting the expression cassette into the loblolly pine 
genome to thereby produce a modified genome in the loblolly pine containing genes which code 
for enzymes which produce syringyl lignin monomer units. 

31. The method of claim 30 wherein the promoter region is a constitutive promoter.

32. A loblolly pine containing an expression cassette produced according to claim 30. 

33. The method of claim 30 wherein the angiosperm DNA sequence is selected from the
class consisting of 4-coumarate CoA ligase (4CL), bifunctional-O-methyl transferase (bi-OMT)
and P450-1 and P450-2.

34. A loblolly pine containing one or more of the DNA sequences of claim 33. 

35. A loblolly pine containing the angiosperm DNA sequence inserted by the method of 
claim 30. 

36. A method for modifying the genome of loblolly pine which comprises cloning the 
sweetgum P450-1 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into the loblolly pine genome. 

37. A loblolly pine containing the P450-1 gene. 

38. A method for modifying the genome of loblolly pine which comprises cloning the 
sweetgum P450-2 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into the loblolly pine genome. 

39. A loblolly pine containing the P450-2 gene. 

40. A method for modifying the genome of a gymnosperm which comprises cloning the
sweetgum P450-1 gene, fusing it to a constitutive promoter to form an expression cassette, and
inserting the expression cassette into the gymnosperm genome.

41. A method for modifying the genome of a gymnosperm which comprises cloning the
sweetgum P450-2 gene, fusing it to a constitutive promoter to form an expression cassette, and 
inserting the expression cassette into a gymnosperm genome. 

42. A gymnosperm containing the P450-1 gene. 

43. A gymnosperm containing the P450-2 gene. 

44. A gymnosperm containing a DNA sequence selected from the class consisting of the 
P450-1 DNA sequence of SEQ ID No. 1 and 2, the P450-2 DNA sequence of SEQ ID No. 3 and 
4, the bi-OMT DNA sequence of SEQ ID No. 5 and 6, and the 4CL DNA sequences of SEQ ID 
No. 7 and 8. 

45. The gymnosperm of Claim 38, further comprising syringyl lignin. 
SEQ ID 5

<400> 5

cggcacgagc cctacctcct ttcttggaaa aatttcccca ttcgatcaca atccgggcct 60

caaaaa atg gga tca aca agc gaa acg aag atg agc ccg agt gaa gco 108
Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala
1 5 10

gca gca gca gaa gaa gaa gca ttc gta ttc gct atg caa tta acc agt 156
Ala Ala Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser
15 20 25 30

gct tca gtt ctt ccc atg gtc cta aaa tca gcc ata gag ctc gac gtc 204
Ala Ser Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val
35 40 45

tta gaa atc atg gct aaa gct ggt cca ggt gcg cac ata tcc aca tct 252
Leu Glu Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser
50 55 60

gac ata gcc tct aag ctg ccc aca aag aat cca gat gca gcc gtc atg 300
Asp Ile Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met
65 70 75

ctt gac cgt atg ctc cgc ctc ttg gct agc tac tct gtt cta acg tgc 348
Leu Asp Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys
80 85 90

tct ctc cgc acc ctc cct gac ggc aag atc gag agg ctt tac ggc ctt 396
Ser Leu Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu
95 100 105 110

gca ccc gtt tgt aaa ttc ttg acc aga aac gat gat gga gtc tcc ata 444
Ala Pro Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile
115 120 125

gcc gct ctg tct ctc atg aat caa gac aag gtc ctc atg gag agc tgg 492
Ala Ala Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp
130 135 140

tac cac ttg acc gag gca gtt ctt gaa ggt gga att cca ttt aac aag 540
Tyr His Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys
145 150 155

Fig. 2A
SEQ ID 5

gcc tat gga atg aca gca ttt gag tac cat ggc acc gat ccc aga ttc 588
Ala Tyr Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe
160 165 170

aac aca gtt ttc aac aat gga atg tcc aat cat tcg acc att acc atg 636
Asn Thr Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met
175 180 185 190

i*a*4 aaa atc ctt gag act tac aaa ggg ttc gag gga ctt gga tct gtg 684
Lys Lys Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val
195 200 205

gtt gat gtt ggt ggt ggc act ggt gcc cac ctt aac atg att atc gct 732
Val Asp Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala
210 215 220

aaa tac ccc atg atc aag ggc att aac ttc gac ttg cct cat gtt att 780
Lys Tyr Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile
225 230 235

gag gag gct ccc tcc tat cct ggt gtg gag cat gtt ggt gga gat atg 828
Glu Glu Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met
240 245 250

ttt gtt agt gtt cca aaa gga gat gcc att ttc atg aag tgg ata tgt 876
Phe Val Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys
255 260 265 270

cat gat tgg agc gat gaa cac tgc ttg aag ttt ttg aag aaa tgt tat 924
His Asp Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr
275 280 285

gaa gca ctt cca acc aat ggg aag gtg atc ctt gct gaa tgc atc ctc 972
Glu Ala Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu
290 295 300

ccc gtg gcg cca gac gca agc ctc ccc act aag gca gtg gtc cat att 1020
Pro Val Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile
305 310 315

gat gtc atc atg ttg gct cat aac cca ggt ggg aaa gag aga act gag 1068
Asp Val Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu
320 325 330

aag gag ttt gag gcc ttg gcc aag ggg gct gga ttt gaa ggt ttc cga 1116
Lys Glu Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg

Fig. 2B
SEQ ID 5

335 340 345 350

gta gta gcc tcg tgc gct tac aat aca tgg atc atc gaa ttt ttg aag 1164
Val Val Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys
355 360 365

aag att tgagtcctta ctcggctttg agtacataat accaactcct tttggttttc 1220
Lys Ile

gagattgtga ttgtgattgt gattgtctct ctttcgcagt tggccttatg atataatgta 1280
tcgttaactc gatcacagaa gtgcaaaaga cagtgaatgt acactgcttt ataaaataaa 1340
aattttaaga ttttgattca tgtaaaaaaa aaaaaaaaaa 1380

Fig. 2C
SEQ ID 6

<400> 6

Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala Ala Ala
1 5 10 15

Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser Ala Ser
20 25 30

Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val Leu Glu
35 40 45

Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser Asp Ile
50 55 60

Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met Leu Asp
65 70 75 80

Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys Ser Leu
85 90 95

Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu Ala Pro
100 105 110

Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile Ala Ala
115 120 125

Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His
130 135 140

Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys Ala Tyr
145 150 155 160

Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Thr
165 170 175

Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys
180 185 190

Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val Val Asp
195 200 205

Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala Lys Tyr
210 215 220

Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Glu
225 230 235 240

Fig. 2D
SEQ ID 6

Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met Phe Val
245 250 255

Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp
260 265 270

Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr Glu Ala
275 280 285

Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val
290 295 300

Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile Asp Val
305 310 315 320

Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu
325 330 335

Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg Val Val
340 345 350

Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys Lys Ile
355 360 365

Fig. 2E
SEQ ID 7

<400> 7

cggcacgagc tcattttcca cttctggttt gatctctgca attcttccat cagtcccta 59

atg gag acc caa aca aaa caa gaa gaa atc ata tat cgg tcg aaa ctc 107
Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu
1 5 10 15

ccc gat atc tac atc ccc aaa cac ctc cct tta cat tcg tat tgt ttc 155
Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe
20 25 30

gag aac atc tca cag ttc ggc tcc cgc ccc tgt ctg atc aat ggc gca 203
Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala
35 40 45

acg ggc aag tat tac aca tat gct gag gtt gag ctc att gcg cgc aag 251
Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys
50 55 60

gtc gca tcc ggc ctc aac aaa ctc ggc gtt cga caa ggt gac atc atc 299
Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile
65 70 75 80

atg ctt ttg cta ccc aac tcg ccg gag ttc gtg ttt tca att ctc ggc 347
Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly
85 90 95

gca tcc tac cgc ggg gct gcc gcc acc gcc gca aac ccg ttt tat acc 395
Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr
100 105 110

cct gcc gag atc agg aag caa gcc aaa acc tcc aac gcc agg ctt att 443
Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile
115 120 125

atc aca cat gcc tgt tac tat gag aaa gtg aag gac ttg gtg gaa gag 491
Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu
130 135 140

aac gtt gcc aag atc ata tgt ata gac tca ccc ccg gac ggt tgt ttg 539
Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu
145 150 155 160

Fig. 3A
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SEQ 10 1 

cac ttc tcg gag ctg agt gag gcg gac gag aac gac atg ccc aat gta 587
His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val
165 170 175

gag att gac ccc gat gat gtg gtg gcg ctg ccg tac tcg tca ggg acg 635
Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr
180 185 190

acg ggt tta cca aag ggg gtg atg cta aca cac aag gga caa gtg acg 683
Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr
195 200 205

agt gtg gcg caa cag gtg gac gga gag aat ccg aac ctg tat ata cat 731
Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His
210 215 220

agc gag gac gtg gtt ctg tgc gtg ttg cct ctg ttt cac atc tac tcg 779
Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser
225 230 235 240

atg aac gtc atg ttt tgc ggg tta cga gtt ggt gcg gcg att ctg att 827
Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile
245 250 255

atg cag aaa ttt gaa ata tat ggg ttg tta gag ctg gtc aga agt aca 875
Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr
260 265 270

ggt gac cat cat gcc tat cgt aca ccc atc gta ttg gca atc tcc aag 923
Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys
275 280 285

act ccg gat ctt cac aac tat gat gtg tcc tcc att cgg act gtc atg 971
Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met
290 295 300

tca ggt gcg gct cct ctg ggc aag gaa ctt gaa gat tct gtc aga gct 1019
Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala
305 310 315 320

aag ttt ccc acc gcc aaa ctt ggt cag gga tat gga atg acg gag gca 1067
Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala
325 330 335

ggg ccc gtg cta gcg atg tgt ttg gca ttt gcc aag gaa ggg ttt gaa 1115
Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu
340 345 350

Fig. 3B

SEQ ID 7

ata aaa tcg ggg gca tct gga act gtt tta agg aac gca cag atg aag 1163
Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys
355 360 365

att gtg gac cct gaa acc ggt gtc act ctc cct cga aac caa ccc gga 1211
Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly
370 375 380

gag att tgc att aga gga gac caa atc atg aaa ggt tat ctt aat gat 1259
Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp
385 390 395 400

cct gag gcg acg gag aga acc ata gac aag gaa ggt tgg tta cac aca 1307
Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr
405 410 415

ggt gat gtg ggc tac atc gac gat gac act gag ctc ttc att gtt gat 1355
Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp
420 425 430

cgg ttg aag gaa ctg atc aaa tac aaa ggg ttt cag gtg gca ccc gct 1403
Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala
435 440 445

gag ctt gag gcc atg ctc ctc aac cat ccc aac atc tct gat gct gcc 1451
Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala
450 455 460

gtc gtc cca atg aaa gac gat gaa gct gga gag ctc cct gtg gcg ttt 1499
Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe
465 470 475 480

gtt gta aga tca gat ggt tct cag ata tcc gag gct gaa atc agg caa 1547
Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln
485 490 495

tac atc gca aaa cag gtg gtt ttt tat aaa aga ata cat cgc gta ttt 1595
Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe
500 505 510

ttc gtc gaa gcc att cct aaa gcg ccc tct ggc aaa atc ttg cgg aag 1643
Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys
515 520 525

gac ctg aga gcc aaa ttg gcg tct ggt ctt ccc aat taattctcat 1689
Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn
530 535 540

Fig. 3C

SEQ ID 7

u , ctK « t cctttctctt 3t «uc 9 cc aa ca= g aac 9 a ag a g ,c,ca att aaac g c t 1,49 
, ctMtt c. agc „ccca a tt aaa gCtgC tcaU^U C cac Cg a gtg gg ca g cc tg L l0 09 
ctt,t t „g. t,«C:«c. tttg a tt ca g ct.^aa, cca g accc t c "69 
gtg aaa t Uc 0 caa g aa tgt c tg taaatc g a t ,tt,t fl .,t g a tgggtt Uc aaaacac.UU 1929 
tga ca ttgt U t ac gttgt au tt~t,ct,t f.c fcttt fl t.t .~tttl.lt 1909 

tgggaagata acctttcaaa aaaaaaaaaa aaaaaa 2025

Fig. 3D

SEQ ID 8

<400> 8

Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu
1 5 10 15

Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe
20 25 30

Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala
35 40 45

Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys
50 55 60

Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile
65 70 75 80

Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly
85 90 95

Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr
100 105 110

Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile
115 120 125

Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu
130 135 140

Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu
145 150 155 160

His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val
165 170 175

Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr
180 185 190

Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr
195 200 205

Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His
210 215 220

Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser
225 230 235 240

Fig. 3E

SEQ ID 8

Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile
245 250 255

Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr
260 265 270

Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys
275 280 285

Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met
290 295 300

Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala
305 310 315 320

Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala
325 330 335

Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu
340 345 350

Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys
355 360 365

Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly
370 375 380

Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp
385 390 395 400

Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr
405 410 415

Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp
420 425 430

Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala
435 440 445

Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala
450 455 460

Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe
465 470 475 480

Fig. 3F

SEQ ID 8



Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln
485 490 495

Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe
500 505 510

Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys
515 520 525

Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn
530 535 540

Fig. 3G

SEQ ID 1



cggcacgagg aaaccctaaa actcacctct cttacccttc ctcttca atg gct ttc 56
Met Ala Phe
1

ctt cta ata ccc atc tca ata atc ttc atc gtc tta gct tac cag ctc 104
Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala Tyr Gln Leu
5 10 15

tat caa cgg ctc aga ttt aag ctc cca ccc ggc cca cgt cca tgg ccg 152
Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro
20 25 30 35

atc gtc gga aac ctt tac gac ata aaa ccg gtg agg ttc cgg tgt ttc 200
Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe
40 45 50

gcc gag tgg tca caa gcg tac ggt ccg atc ata tcg gtg tgg ttc ggt 248
Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly
55 60 65

tca acg ttg aat gtg atc gta tcg aat tcg gaa ttg gct aag gaa gtg 296
Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val
70 75 80

ctc aag gaa aaa gat caa caa ttg gct gat agg cat agg agt aga tca 344
Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg Ser Arg Ser
85 90 95

gct gcc aaa ttt agc agg gat ggg cag gac ctt ata tgg gct gat tat 392
Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp Ala Asp Tyr
100 105 110 115

gga cct cac tat gtg aag gtt aca aag gtt tgt acc ctc gag ctt ttt 440
Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu Glu Leu Phe
120 125 130

act cca aag cgg ctt gaa gct ctt aga ccc att aga gaa gat gaa gtt 488
Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val
135 140 145

aca gcc atg gtt gag tcc att ttt aat gac act gcg aat cct gaa aat 536
Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn Pro Glu Asn
150 155 160

Fig. 4A

SEQ ID 1

tat ggg aag agt atg ctg gtg aag aag tat ttg gga gca gta gca ttc 584
Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala Val Ala Phe
165 170 175

aac aac att aca aga ctc gca ttt gga aag cga ttc gtg aat tca gag 632
Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu
180 185 190 195

ggt gta atg gac gag caa gga ctt gaa ttt aag gaa att gtg gcc aat 680
Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile Val Ala Asn
200 205 210

gga ctc aag ctt ggt gcc tca ctt gca atg gct gag cac att cct tgg 728
Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp
215 220 225

ctc cgt tgg atg ttc cca ctt gag gaa ggg gcc ttt gcc aag cat ggg 776
Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly
230 235 240

gca cgt agg gac cga ctt acc aga gct atc atg gaa gag cac aca ata 824
Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu His Thr Ile
245 250 255

gcc cgt aaa aag agt ggt gga gcc caa caa cat ttc gtg gat gca ttg 872
Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val Asp Ala Leu
260 265 270 275

ctc acc cta caa gag aaa tat gac ctt agc gag gac act att att ggg 920
Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly
280 285 290

ctc ctt tgg gat atg atc act gca ggc atg gac aca acc gca atc tct 968
Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser
295 300 305

gtc gaa tgg gcc atg gcc gag tta att aag aac cca agg gtg caa caa 1016
Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg Val Gln Gln
310 315 320

aaa gct caa gag gag cta gac aat gta ctt ggg tcc gaa cgt gtc ctg 1064
Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu Arg Val Leu
325 330 335

Fig. 4B

SEQ ID 1

acc gaa ttg gac ttc tca agc ctc cct tat cta caa tgt gta gcc aag 1112
Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys Val Ala Lys
340 345 350 355

gag gca cta agg ctg cac cct cca aca cca cta atg ctc cct cat cgc 1160
Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg
360 365 370

gcc aat gcc aac gtc aaa att ggt ggc tac gac atc cct aag gga tca 1208
Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro Lys Gly Ser
375 380 385

aat gtt cat gta aat gtc tgg gcc gtg gct cgt gat cca gca gtg tgg 1256
Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp
390 395 400

cgt gac cca cta gag ttt cga ccg gaa cgg ttc tct gaa gac gat gtc 1304
Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu Asp Asp Val
405 410 415

gac atg aaa ggt cac gat tat agg cta ctg ccg ttt ggt gca ggg aga 1352
Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly Ala Gly Arg
420 425 430 435

cgt gtt tgc ccc ggt gca caa ctt ggc atc aat ttg gtc aca tcc atg 1400
Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val Thr Ser Met
440 445 450

atg ggt cac cta ttg cac cat ttc tat tgg agc cct cct aaa ggt gta 1448
Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro Lys Gly Val
455 460 465

aaa cca gag gag att gac atg tca gag aat cca gga ttg gtc acc tac 1496
Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu Val Thr Tyr
470 475 480

atg cga acc ccg gtg caa gct gtt ccc act cca agg ctg cct gct cac 1544
Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu Pro Ala His
485 490 495

ttg tac aaa cgt gta gct gtg gat atg taattcttag tttgttatta 1591
Leu Tyr Lys Arg Val Ala Val Asp Met
500 505

Fig. 4C

SEQ ID 1

ttcatgctct taaggttttg gactttgaac ttatgatgag atttgtaaaa ttccaagtga 1651
tcaaatgaag aaaagaccan ataaaaaggc ttgacgattt aaaaaaaaaa aaaaaaa 1708

SEQ ID 2

Met Ala Phe Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala
1 5 10 15

Tyr Gln Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg
20 25 30

Pro Trp Pro Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe
35 40 45

Arg Cys Phe Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val
50 55 60

Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala
65 70 75 80

Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg
85 90 95

Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp
100 105 110

Ala Asp Tyr Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu
115 120 125

Glu Leu Phe Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu
130 135 140

Asp Glu Val Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn
145 150 155 160

Pro Glu Asn Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala
165 170 175

Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val
180 185 190

Asn Ser Glu Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile
195 200 205

Val Ala Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His
210 215 220

Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala
225 230 235 240

Fig. 4E

SEQ ID 2

Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu
245 250 255

His Thr Ile Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val
260 265 270

Asp Ala Leu Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr
275 280 285

Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr
290 295 300

Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg
305 310 315 320

Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu
325 330 335

Arg Val Leu Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys
340 345 350

Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu
355 360 365

Pro His Arg Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro
370 375 380

Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro
385 390 395 400

Ala Val Trp Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu
405 410 415

Asp Asp Val Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly
420 425 430

Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val
435 440 445

Thr Ser Met Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro
450 455 460

Lys Gly Val Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu
465 470 475 480

SEQ ID 2



Val Thr Tyr Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu
485 490 495

Pro Ala His Leu Tyr Lys Arg Val Ala Val Asp Met
500 505

Fig. 4G

SEQ ID 3

<400> 3 

tgcaaacctg cacaaacaaa gagagagoag aagaaaaagg aagagaggag agagagagag 60 

agagagagaa gcc atg gat tct tct ctt cat gaa gcc ttg caa cca cta 109
Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu
1 5 10

ccc atg acg ctg ttc ttc att ata cct ttg cta ctc tta ttg ggc cta 157
Pro Met Thr Leu Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu
15 20 25

gta tct cgg ctt cgc cag aga cta cca tac cca cca ggc cca aaa ggc 205
Val Ser Arg Leu Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly
30 35 40

tta ccg gtg atc gga aac atg ctc atg atg gat caa ctc act cac cga 253
Leu Pro Val Ile Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg
45 50 55 60

gga ctc gcc aaa ctc gcc aaa caa tac gc:- ggt cta ttc cac ctc aag 301
Gly Leu Ala Lys Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys
65 70 75

atg gga ttc tta cac atg gtg gcc gtt tcc aca ccc gac atg gct cgc 349
Met Gly Phe Leu His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg
80 85 90

caa gtc ctt caa gtc caa gac aac atc ttc tcg aac cgg cca gcc acc 397
Gln Val Leu Gln Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr
95 100 105

ata gcc atc agc tac ctc acc tat gac cga gcc gac atg gcc ttc gct 445
Ile Ala Ile Ser Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala
110 115 120

cac tac ggc ccg ttt tgg cgt cag atg cgt aaa ctc tgc gtc atg aaa 493
His Tyr Gly Pro Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys
125 130 135 140

tta ttt agc cgg aaa cga gcc gag tcg tgg gag tcg gtc cga gac gag 541
Leu Phe Ser Arg Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu
145 150 155

gtc gac tcg gca gta cga gtg gtc gcg tcc aat att ggg tcg acg gtg 589
Val Asp Ser Ala Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val
160 165 170

Fig. 5A

SEQ ID 3

aat atc ggc gag ctg gtt ttt gct ctg acg aag aat att act tac agg 637
Asn Ile Gly Glu Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg
175 180 185

gcg gct ttt ggg acg atc tcg cat gag gac cag gac gag ttc gtg gcc 685
Ala Ala Phe Gly Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala
190 195 200

ata ctg caa gag ttt tcg cag ctg ttt ggt gct ttt aat ata gct gat 733
Ile Leu Gln Glu Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp
205 210 215 220

ttt atc cct tgg ctc aaa tgg gtt cct cag ggg att aac gtc agg ctc 781
Phe Ile Pro Trp Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu
225 230 235

aac aag gca cga ggg gcg ctt gat ggg ttt att gac aag atc atc gac 829
Asn Lys Ala Arg Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp
240 245 250

gat cat ata cag aag ggg agt aaa aac tcg gag gag gtt gat act gat 877
Asp His Ile Gln Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp
255 260 265

atg gta gat gat tta ctt gct ttt tac ggt gag gaa gcc aaa gta agc 925
Met Val Asp Asp Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser
270 275 280

gaa tct gac gat ctt caa aat tcc atc aaa ctc acc aaa gac aac atc 973
Glu Ser Asp Asp Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile
285 290 295 300

aaa gct atc atg gac gta atg ttt gga ggg acc gaa acg gtg gcg tcc 1021
Lys Ala Ile Met Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser
305 310 315

gcg att gaa tgg gcc atg acg gag ctg atg aaa agc cca gaa gat cta 1069
Ala Ile Glu Trp Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu
320 325 330

aag aag gtc caa caa gaa ctc gcc gtg gtg gtg ggt ctt gac cgg cga 1117
Lys Lys Val Gln Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg
335 340 345

Fig. 5B

gtc gaa gag aaa gac ttc gag aag ctc acc tac ttg aaa tgc gta ctg 1165
Val Glu Glu Lys Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu
350 355 360

aag gaa gtc ctt cgc ctc cac cca ccc atc cca ctc ctc ctc cac gag 1213
Lys Glu Val Leu Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu
365 370 375 380

act gcc gag gac gcc gag gtc ggc ggc tac tac att ccg gcg aaa tcg 1261
Thr Ala Glu Asp Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser
385 390 395

cgg gtg atg atc aac gcg tgc gcc atc ggc cgg gac aag aac tcg tgg 1309
Arg Val Met Ile Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp
400 405 410

gcc gac cca gat acg ttt agg ccc tcc agg ttt ctc aaa gac ggt gtg 1357
Ala Asp Pro Asp Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val
415 420 425

ccc gat ttc aaa ggg aac aac ttc gag ttc atc cca ttc ggg tca ggt 1405
Pro Asp Phe Lys Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly
430 435 440

cgt cgg tct tgc ccc ggt atg caa ctc gga ctc tac gcg cta gag acg 1453
Arg Arg Ser Cys Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr
445 450 455 460

act gtg gct cac ctc ctt cac tgt ttc acg tgg gag ttg ccg gac ggg 1501
Thr Val Ala His Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly
465 470 475

atg aaa ccg agt gaa ctc gag atg aat gat gtg ttt gga ctc acc gcg 1549
Met Lys Pro Ser Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala
480 485 490

cca aga gcg att cga ctc acc gcc gtg ccg agt cca cgc ctt ctc tgt 1597
Pro Arg Ala Ile Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys
495 500 505

cct ctc tat tgatcgaatg attgggggag ctttgtggag gggcttttat 1646
Pro Leu Tyr
510

Fig. 5C

SEQ ID 3

ggng.ictcta tatatagatg ggaagtgaaa caacgacagg tgaatgcttg gatttttggt 1706
atatattggg gagggagggg aaaaaaaaaa taatgaaagg aaacj.w.oaga gagaatttga 1766
atttctcttc ctctgtggat aaaagcctcg tttttaattg tttttatgtg gagatatttg 1826
L'jtttgLtta tttttatctc tttttttgca ataacactca aaaatnaaaa aaaaaaa 1883

Fig. 5D

SEQ ID 4

<400> 4

Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu Pro Met Thr Leu
1 5 10 15

Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu Val Ser Arg Leu
20 25 30

Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly Leu Pro Val Ile
35 40 45

Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg Gly Leu Ala Lys
50 55 60

Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys Met Gly Phe Leu
65 70 75 80

His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg Gln Val Leu Gln
85 90 95

Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr Ile Ala Ile Ser
100 105 110

Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala His Tyr Gly Pro
115 120 125

Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys Leu Phe Ser Arg
130 135 140

Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu Val Asp Ser Ala
145 150 155 160

Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val Asn Ile Gly Glu
165 170 175

Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg Ala Ala Phe Gly
180 185 190

Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala Ile Leu Gln Glu
195 200 205

Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp Phe Ile Pro Trp
210 215 220

Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu Asn Lys Ala Arg
225 230 235 240

Fig. 5E

Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp Asp His Ile Gln
245 250 255

Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp Met Val Asp Asp
260 265 270

Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser Glu Ser Asp Asp
275 280 285

Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile Lys Ala Ile Met
290 295 300

Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser Ala Ile Glu Trp
305 310 315 320

Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu Lys Lys Val Gln
325 330 335

Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg Val Glu Glu Lys
340 345 350

Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu Lys Glu Val Leu
355 360 365

Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu Thr Ala Glu Asp
370 375 380

Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser Arg Val Met Ile
385 390 395 400

Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp Ala Asp Pro Asp
405 410 415

Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val Pro Asp Phe Lys
420 425 430

Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly Arg Arg Ser Cys
435 440 445

Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr Thr Val Ala His
450 455 460

Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly Met Lys Pro Ser
465 470 475 480

SEQ ID 4

Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala Pro Arg Ala Ile
485 490 495

Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys Pro Leu Tyr
500 505 510

Fig. 5G

SEQ ID 10

<400> 10

aaacaccaat ttaatgggat ttcagatttg tatcccatgc tattggctaa ggcatttttc 60
ttattgtaat ctaaccaatt ctaatttcca ccctggtgtg aactgactga caaatgcggt 120
ccgaaaacag cgaatgaaat gtctgggtga tcggtcaaac aagcggtggg cgagagagcg 180
cgggtgttgg cctagccggg atgggggtag gtagacggcg tattaccggc gagttgtccg 240
aatggagttt tcggggtagg tagtaacgta gacgtcaatg gaaaaagtca taatctccgt 300
caaaaatcca accgctcctt cacatcgcag agttggtggc cacgggaccc tccacccact 360
cactcaatcg atcgcctgcc gtggttgccc attattcaac catacgccac ttgactcttc 420
accaacaatt ccaggccggc tttctataca atgtactgca caggaaaatc caatataaaa 480
agccggcctc tgcttccttc tcagtagccc ccagctcatt caattcttcc cactgcaggc 540
tacatttgtc agacacgttt tccgccattt ttcgcctgtt tctgcggaga atttgatcag 600
gttcggattg ggattgaatc aattgaaagg tttttatttt cagtatttcg atcgccatg 659

Fig. 6

SEQ ID 11

<400> 11 

99=cgggt gg tgacatttat tc ataaat tc atctcaaaac aag aag gatt tanaa .. uta 00 
aaag a aa aca aaattutcat cttoacota attataattg tgttcacaaa attcaaactt 120 
aaacccttaa tataaagaat ttctttcaac aatacacttt aatC acnacL tcttcaatca 100 
caacctcctc ca acaaaatt aaaat a gott aataaataaa taaacttaac ta t ttaaaaa 240 
aaaatattat acaaaattta ttaaaacttc aaaataaaca aactttttat acaaaattca 300 
tcaaaacttL aaa.taaagc taaacactga aa atgtgagt acatttaaaa ggacgctgat 3e 0 
cacaaaaatt ttgaaaacat aaacaaactt gaaa Ctctac ctttt aa gaa t gag tttgtc < 20 
gtctcattaa etc.tt.gtt ttatagttcg aatccaatta acgtatcttt tattttatgg 400 
aataagggtg tlttaataag tgattttggg atttttLtag fttt.ttt g tga tat g Ut 540 
atggagtttt taaaaatata tatatatata tatatttttg ggttgagttt ac ttaaaatl 600 . 
tggaaaaggt tggtaagaac tataaattga gttgtgaatg agtgLtttat ggatttttta 660 
agatgUca tttatatatg taattaaaat ttfttttg. .taacaaaaa ttataattgg 720 
ataaaaaatt gttttgttaa a ttt agag u a aaaatttcaa aatctaaaat aattaaocac 700 
tattattttt aaaaaatttc, ttggtaoatt ttatcttata tttaagttoa aatttagaaa 040 
aaattaattt taaatteata aacttttgaa gtcaaatatt ccaaatattt tccaaaatat 900 
taaatctatt ttgcattcaa aatacaattt aoataataaa acttcatgga atagatt a .,j 960 
caatttgtat aaaaaccaaa aaUctc aa at aaaatttaaa ttacaaaaca ttatcaacat 1020 
tatgatttca agaaagacaa taaccagttt ccaataaaat aaaaaacctc atggcccgta 1000 
attaagatct cattaattaa ttcttatttt ttaatttttt tacat.agaaa a tatctttat 1140 
attgtatcc a aga a at a tag aatgttctcg tccagggact attaatctcc aaacaagttt 1200 
caaaatcatt acattaaagc tcatcatgtc atttgtggat tg gaaa ttat attgtataag 1260 
agaaatatag aatgttctcg tctagggact attaatttcc aaacaaattt caaaat:att 1320 

Fig. 7A

SEQ ID 11

acattaaagc tcatcatgtc atttgtggat tggaaattag acaaaaaaaa tcccaaatat 1380
ttctctcaat ctcccaaaat atagttcgaa ctccatattt ttggaaattg agaatttttt 1440
tacccaataa tatatttttt tatacatttt agagattttc cagacatatt tgctctggga 1500
tttattggaa tgaaggttga gttataaact ttcagtaatc caagtatctt cggtttttga 1560
agatactaaa tccattatat aataaaaaca cattttaaac accaatttaa tgggatttca 1620
gatttgtatc ccatgctatt ggctaaggca tttttcttat tgtaatctaa ccaattctaa 1680
tttccaccct ggtgtgaact gactgacaaa tgcggtccga aaacagcgaa tgaaatgtct 1740
gggtgatcgg tcaaacaagc ggtgggcgag agagcgcggg tgttggccta gccgggatgg 1800
gggtaggtag acggcgtatt accggcgagt tgtccgaatg gagttttcgg ggtaggtagt 1860
aacgtagacg tcaatggaaa aagtcataat ctccgtcaaa aatccaaccg ctccttcaca 1920
tcgcagagtt ggtggccacg ggaccctcca cccactcact cgatcgcctg ccgtggttgc 1980
ccattattca accatacgcc acttgactct tcaccaacaa ttccaggccg gctttctata 2040
caatgtactg cacaggaaaa tccaatataa aaagccggcc tctgcttcct tctcagtagc 2100
ccccagctca ttcaattctt cccactgcag gctacatttg tcagacacgt tttccgccat 2160
ttttcgcctg tttctgcgga gaatttgatc aggttcggat tgggattgaa tcaattgaaa 2220
ggtttttatt ttcagtattt cgatcgccat g 2251

Fig. 7B

SEQ ID 9

<400> 9

aaagataata Latgtgtatg cctactacta cacattgttt tgaagtgtgt aaucalagtg CO 
caacactagg aggactcaca atgagcactt gltgacaLga aactagctaa atgcccaaca 120 
atat ;agtga aagcLagLLa aactaacccc tttgacLUtc aaqntgatal; a tLUatatcc 100 
cLactacgtc tLcctctttt UgLcLULcLc ttgtgaLtaa acct tccttg aaacaattcl: 2-10 
caaatgtaaa attaaacctt gaaacttgta gagaccaaac ttccctagga gaaaccacat 300 
ttatgacaac atataLacac caacccattg catactataa tatitggaatt acctgcagcg 3C0 
aacgaaagaa acgctgtctc accaactcgt gcactacatc ccgaaactLa accttcccct 420 
gatacagatt gaagagccga aaaaagcgtg catccaaatt tctggLatgg tgaggagccg 400 
aaaaacgcgt gcgcctaatt tutttgagat gggccggaaa ataatgcgtg catctaaatt 540 
LUcacgtgtc gcgbattggc gaggttgcgc CgaatgLgaU cctgtgcgtg agccacattc 600 
aLtccattgg tLgacccgcc ggtaccgcga ggaccgtggg gtcLcacaga tacgcggaUg 660 
gtggatcagc actgagaaga ttagatgatg accaggcggg catttgaagt aaaaacttgg 720 
gggtggttgg caagLacgcg acaaagaggg gtagtgcgca aggaagcgag ttggatgcaa 700 
ataatattac aaagtgggtt ggtgggcatg agcatcaacc agaatgatgt tgttgctggt 040 
tccgtgcaaa ttcLgaccag tagttLgaac aatactaccc aacttgtttt tggtaaaaca 900 
tgaagtgggt aaggagaatt gaacttacgt ctcatggtaa agggcaaggg caaatgactt 960 
aacacatacc tttaactaat aaaaataccc ctaacaaata cgaaaacgaa tgagttatca 1020 
cagaccttca actaataaga tagccatcag acccacatct cctgactgac caaaaacaaa 1080 
tgactUcaac caactaagat acccatcaaa gctaacccac aacccaattc ctcacttccc 1140 
cttaccagac caaccaagca gacctacgcc attaactact ttaggacgtg ggaattgggg 1200 
gtgccaccgu tgaagaatgg cactcagggt Uggtaatccc tccacgCgta tgUagcagtc 1260 
gtttggtgga gacggcgUgt ttgaatgtcc accttccagt ttggagaaca aggaaaUtgg 1320 

Fig. 8A

SEQ ID 9

gcttatatta ggcctggatc tcttgtttca gtgcaggagt agttcaggac aggaactagc 1380
attcaagaat tcaattgccc tgccctgctc tgctctgctt tgctcaactt attgatccct 1440
gctctggttt gttcaatttc ttgacccctg ctgggttctg ctctggtttg cacactttct 1500
cgattatata agtcattttg gatccttgca aggaagagaa tatg 1544

SEQUENCE LISTING

<110> Chiang, Vincent L 
Carraway, Daniel T 
Smeltzer, Richard H

<120> Production of Syringyl Lignin in Gymnosperms

<130> 50617

<140> US 08/991,677
<141> 1997-12-16

<150> US 60/033,381
<151> 1996-12-16

<160> 11

<170> PatentIn Ver. 2.0

<210> 1
<211> 1708
<212> DNA
<213> Liquidambar styraciflua

<220>
<221> CDS
<222> (48)..(1571)

<400> 1

cggcacgagg aaaccctaaa actcacctct cttacccttt ctcttca atg gct ttc 56
Met Ala Phe
1

ctt cta ata ccc atc tca ata atc ttc atc gtc tta gct tac cag ctc 104
Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala Tyr Gln Leu
5 10 15

tat caa cgg ctc aga ttt aag ctc cca ccc ggc cca cgt cca tgg ccg 152
Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro
20 25 30 35

atc gtc gga aac ctt tac gac ata aaa ccg gtg agg ttc cgg tgt ttc 200
Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe
40 45 50

gcc gag tgg tca caa gcg tac ggt ccg atc ata tcg gtg tgg ttc ggt 248
Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly
55 60 65

tca acg ttg aat gtg atc gta tcg aat tcg gaa ttg gct aag gaa gtg 296
Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val
70 75 80

ctc aag gaa aaa gat caa caa ttg gct gat agg cat agg agt aga tca 344
Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg Ser Arg Ser
85 90 95

gct gcc aaa ttt agc agg gat ggg cag gac ctt ata tgg gct gat tat 392
Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp Ala Asp Tyr
100 105 110 115

gga cct cac tat gtg aag gtt aca aag gtt tgt acc ctc gag ctt ttt 440
Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu Glu Leu Phe
120 125 130

act cca aag cgg ctt gaa gct ctt aga ccc att aga gaa gat gaa gtt 488
Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val
135 140 145

aca gcc atg gtt gag tcc att ttt aat gac act gcg aat cct gaa aat 536
Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn Pro Glu Asn
150 155 160

tat ggg aag agt atg ctg gtg aag aag tat ttg gga gca gta gca ttc 584
Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala Val Ala Phe
165 170 175

aac aac att aca aga ctc gca ttt gga aag cga ttc gtg aat tca gag 632
Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu
180 185 190 195

ggt gta atg gac gag caa gga ctt gaa ttt aag gaa att gtg gcc aat 680
Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile Val Ala Asn
200 205 210

gga ctc aag ctt ggt gcc tca ctt gca atg gct gag cac att cct tgg 728
Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp
215 220 225

ctc cgt tgg atg ttc cca ctt gag gaa ggg gcc ttt gcc aag cat ggg 776
Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly
230 235 240

gca cgt agg gac cga ctt acc aga gct atc atg gaa gag cac aca ata 824
Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu His Thr Ile
245 250 255

gcc cgt aaa aag agt ggt gga gcc caa caa cat ttc gtg gat gca ttg 872 
Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val Asp Ala Leu 
260 265 270 275 

ctc acc cta caa gag aaa tat gac ctt agc gag gac act att att ggg 920 
Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly 
280 285 290 

ctc ctt tgg gat atg atc act gca ggc atg gac aca acc gca atc tct 968 
Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser 
295 300 305 

gtc gaa tgg gcc atg gcc gag tta att aag aac cca agg gtg caa caa 1016 
Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg Val Gln Gln 
310 315 320 

aaa gct caa gag gag cta gac aat gta ctt ggg tcc gaa cgt gtc ctg 1064 
Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu Arg Val Leu 
325 330 335 

acc gaa ttg gac ttc tca agc ctc cct tat cta caa tgt gta gcc aag 1112 
Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys Val Ala Lys 
340 345 350 355 

gag gca cta agg ctg cac cct cca aca cca cta atg ctc cct cat cgc 1160 
Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg 
360 365 370 

gcc aat gcc aac gtc aaa att ggt ggc tac gac atc cct aag gga tca 1208 
Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro Lys Gly Ser 
375 380 385 

aat gtt cat gta aat gtc tgg gcc gtg gct cgt gat cca gca gtg tgg 1256 
Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp 
390 395 400 

cgt gac cca cta gag ttt cga ccg gaa cgg ttc tct gaa gac gat gtc 1304 
Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu Asp Asp Val 
405 410 415 

gac atg aaa ggt cac gat tat agg cta ctg ccg ttt ggt gca ggg agg 1352 
Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly Ala Gly Arg 
420 425 430 435 

cgt gtt tgc ccc ggt gca caa ctt ggc atc aat ttg gtc aca tcc atg 1400 
Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val Thr Ser Met 
440 445 450 

atg ggt cac cta ttg cac cat ttc tat tgg agc cct cct aaa ggt gta 1448 
Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro Lys Gly Val 
455 460 465 

aaa cca gag gag att gac atg tca gag aat cca gga ttg gtc acc tac 1496 
Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu Val Thr Tyr 
470 475 480 

atg cga acc ccg gtg caa gct gtt ccc act cca agg ctg cct gct cac 1544 
Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu Pro Ala His 
485 490 495 

ttg tac aaa cgt gta gct gtg gat atg taattcttag tttgi atta 1591 
Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 

ttcatgctct taaggttttg gactttgaac ttatgatgag atttgtaaaa ttccaagtga 1651 

tcaaatgaag aaaagaccaa ataaaaaggc ttgacgattt aaaaaaaaaa aaaaaaa 1708 



<210> 2 
<211> 508 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 2 

Met Ala Phe Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala 
1 5 10 15 

Tyr Gln Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg 
20 25 30 

Pro Trp Pro Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe 
35 40 45 

Arg Cys Phe Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val 
50 55 60 

Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala 
65 70 75 80 

Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg 
85 90 95 

Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp 
100 105 110 

Ala Asp Tyr Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu 
115 120 125 

Glu Leu Phe Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu 
130 135 140 

Asp Glu Val Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn 
145 150 155 160 

Pro Glu Asn Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala 
165 170 175 

Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val 
180 185 190 

Asn Ser Glu Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile 
195 200 205 

Val Ala Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His 
210 215 220 

Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala 
225 230 235 240 

Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu 
245 250 255 

His Thr Ile Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val 
260 265 270 

Asp Ala Leu Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr 
275 280 285 

Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr 
290 295 300 

Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg 
305 310 315 320 

Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu 
325 330 335 

Arg Val Leu Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys 
340 345 350 

Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu 
355 360 365 

Pro His Arg Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro 
370 375 380 

Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro 
385 390 395 400 

Ala Val Trp Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu 
405 410 415 

Asp Asp Val Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly 
420 425 430 

Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val 
435 440 445 

Thr Ser Met Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro 
450 455 460 

Lys Gly Val Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu 
465 470 475 480 

Val Thr Tyr Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu 
485 490 495 

Pro Ala His Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 



<210> 3 
<211> 1883 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (74)..(1606) 

<400> 3 

tgcaaacctg cacaaacaaa gagagagaag aagaaaaagg aagagaggag agagagagag 60 

agagagagaa gcc atg gat tct tct ctt cat gaa gcc ttg caa cca cta 109 
Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu 
1 5 10 

ccc atg acg ctg ttc ttc att ata cct ttg cta ctc tta ttg ggc cta 157 
Pro Met Thr Leu Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu 
15 20 25 



m m l" r : a9 a9a cta cca tac cca c = a ««= «=« — ^ 

Ser Arg Leu Arc, Gin Arg Leu Pro Tyr Pro PrQ Gly prQ Lys " 
35 40 



60 



III IT IT f" 9CC " a t3C " C W ct * "c cac etc aag 

Gly Leu Ala Lys Leu Ala Ly s Gln Tyr Gly Gly Le<J phe ^ £ 



70 



75 



"0 105 



He aT T V" CtC tat 9aC C9a 9« -tg gec ttc get 

He Ala Ile Ser Tyr L .„ rhr Tyr Asp Arg „, Asp ^ ^ £ 

115 120 



130 "5 140 

L^u Phe Ser a" 939 tC9 tg9 939 tc * 9tc cga gac gag 

Leu Phe Ser Arg Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Gl! 



150 



155 



165 170 

180 185 

Ml 111 III HI E E £ £ gT rT 939 "= 9t9 9 ~ 

190 ?! U ASP Gln As P Glu Ph e Val Ala 

195 200 



205 



253 



301 



atg gga ttc tta cac atg Q to acc at-t- t-^ « 

Hot Gl y Phe „ is Me 9 ^ £ £ £ £ ~ £ £ £ - - 

85 90 



397 



445 



4 93 



541 



589 



637 



685 
205 



210 



215 



220 



ttt ate cct tgg etc aaa tgg gtt cct cag ggg att aac gtc agg etc 
P he lie Pro Trp Leu Lys Trp Val Pro Gin Gly He Asn V.l Arg Leu 
225 230 

aac aag ,=a cga ggg gcg ctt gat ggg ttt att gac aag ate ate gac 
Asn Lys La Arg Gly Ala Leu Asp Gly Phe lie Asp Lys lie He Asp 



240 



245 



n»t eat ata cag aag ggg agt aaa aac teg gag gag gtt gat act gat 
Tst His lie Gin Lys Gly Ser Lys Asn Ser Glu <U« V.l Asp Thr Asp 



255 



260 



781 



829 



877 



925 



atg ata gat gat tta ctt get ttt tae ggt gag gaa gee aaa gta age 
azq gta y y Val Ser 

Met Val Asp Asp Leu Leu Ala me lyr ^j-y 

270 275 280 

aaa tct gac gat ctt caa aat tec ate aaa etc aec aaa gac aac ate 973 
HI Ser Lp Lp Leu Gin Asn Ser He Lys Leu Thr Lys Asp Asn lie 
285 290 295 

aaa get ate atg gae gta atg ttt gga ggg ace gaa acg gtg gcg tee 
Zl III lie Ket L P Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser 



305 



310 



flro att aaa tag gee atg acg gag ctg atg aaa age cca gaa gat eta 
III lie Glu IZ Ala Met T*r Glu Leu Met Lys Ser Pro Glu Asp Leu 
320 325 330 

aag aag gtc caa caa gaa etc gee gtg gtg gtg ggt ctt gae egg cga 
lyl Ly! Val Gin Gin Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg 
335 340 345 

gtc gaa gag aaa gac tte gag aag cte aec tae ttg aaa tge gta ctg 
Val Glu 111 Lys Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu 
350 355 360 

aag gaa gtc ctt ege etc eae eca cee ate eea etc etc etc eae gag 
Ly s Glu Val Leu Arg Leu His Pro Pro He Pro Leu Leu Leu His Glu 



380 

365 



370 375 



1021 



act ace gag gae gee gag gtc gge ggc tac tac att eeg gcg aaa teg 
Thr Ala Glu Lp Ala Glu Val Gly Gly Tyr Tyr He Pro Ala Lys Ser 
385 390 395 

egg gtg atg ate aae gcg tge gee ate gge egg gae aag aac teg tgg 
Arg Val Met He Asn Ala Cys Ala He Gly Arg Asp Lys Asn Ser Trp 



1069 



1117 



1165 



1213 



1261 



1309 



400 405 410 

gcc gac cca gat acg ttt agg ccc tcc agg ttt ctc aaa gac ggt gtg 1357 
Ala Asp Pro Asp Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val 
415 420 425 

ccc gat ttc aaa ggg aac aac ttc gag ttc atc cca ttc ggg tca ggt 1405 
Pro Asp Phe Lys Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly 
430 435 440 

cgt cgg tct tgc ccc ggt atg caa ctc gga ctc tac gcg cta gag acg 1453 
Arg Arg Ser Cys Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr 
445 450 455 460 

act gtg gct cac ctc ctt cac tgt ttc acg tgg gag ttg ccg gac ggg 1501 
Thr Val Ala His Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly 
465 470 475 

atg aaa ccg agt gaa ctc gag atg aat gat gtg ttt gga ctc acc gcg 1549 
Met Lys Pro Ser Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala 
480 485 490 

cca aga gcg att cga ctc acc gcc gtg ccg agt cca cgc ctt ctc tgt 1597 
Pro Arg Ala Ile Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys 
495 500 505 

cct ctc tat tgatcgaatg attgggggag ctttgtggag gggcttttat 1646 
Pro Leu Tyr 
510 

ggagactcta tatatagatg ggaagtgaaa caacgacagg tgaatgcttg gatttttggt 1706 

atatattggg gagggagggg aaaaaaaaaa taatgaaagg aaagaaaaga gagaatttga 1766 

atttctcttc ctctgtggat aaaagcctcg tttttaattg tttttatgtg gagatatttg 1826 

tgtttgttta tttttatctc tttttttgca ataacactca aaaataaaaa aaaaaaa 1883 



<210> 4 
<211> 511 
<212> PRT 

<213> Liquidambar styraciflua 



<400> 4 

Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu Pro Met Thr Leu 
1 5 10 15 

Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu Val Ser Arg Leu 
20 25 30 

Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly Leu Pro Val Ile 
35 40 45 

Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg Gly Leu Ala Lys 
50 55 60 

Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys Met Gly Phe Leu 
65 70 75 80 

His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg Gln Val Leu Gln 
85 90 95 

Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr Ile Ala Ile Ser 
100 105 110 

Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala His Tyr Gly Pro 
115 120 125 

Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys Leu Phe Ser Arg 
130 135 140 

Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu Val Asp Ser Ala 
145 150 155 160 

Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val Asn Ile Gly Glu 
165 170 175 

Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg Ala Ala Phe Gly 
180 185 190 

Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala Ile Leu Gln Glu 
195 200 205 

Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp Phe Ile Pro Trp 
210 215 220 

Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu Asn Lys Ala Arg 
225 230 235 240 

Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp Asp His Ile Gln 
245 250 255 

Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp Met Val Asp Asp 
260 265 270 

Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser Glu Ser Asp Asp 
275 280 285 

Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile Lys Ala Ile Met 
290 295 300 

Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser Ala Ile Glu Trp 
305 310 315 320 

Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu Lys Lys Val Gln 
325 330 335 

Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg Val Glu Glu Lys 
340 345 350 

Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu Lys Glu Val Leu 
355 360 365 

Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu Thr Ala Glu Asp 
370 375 380 

Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser Arg Val Met Ile 
385 390 395 400 

Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp Ala Asp Pro Asp 
405 410 415 

Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val Pro Asp Phe Lys 
420 425 430 

Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly Arg Arg Ser Cys 
435 440 445 

Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr Thr Val Ala His 
450 455 460 

Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly Met Lys Pro Ser 
465 470 475 480 

Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala Pro Arg Ala Ile 
485 490 495 

Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys Pro Leu Tyr 
500 505 510 

<210> 5 
<211> 1380 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (67)..(1170) 

<400> 5 

cggcacgagc cctacctcct ttcttggaaa aatttcccca ttcgatcaca atccgggcct 60 

caaaaa atg gga tca aca agc gaa acg aag atg agc ccg agt gaa gca 108 
Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala 
1 5 10 

gca gca gca gaa gaa gaa gca ttc gta ttc gct atg caa tta acc agt 156 
Ala Ala Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser 
15 20 25 30 

gct tca gtt ctt ccc atg gtc cta aaa tca gcc ata gag ctc gac gtc 204 
Ala Ser Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val 
35 40 45 

tta gaa atc atg gct aaa gct ggt cca ggt gcg cac ata tcc aca tct 252 
Leu Glu Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser 
50 55 60 

gac ata gcc tct aag ctg ccc aca aag aat cca gat gca gcc gtc atg 300 
Asp Ile Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met 
65 70 75 

ctt gac cgt atg ctc cgc ctc ttg gct agc tac tct gtt cta acg tgc 348 
Leu Asp Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys 
80 85 90 

tct ctc cgc acc ctc cct gac ggc aag atc gag agg ctt tac ggc ctt 396 
Ser Leu Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu 
95 100 105 110 

gca ccc gtt tgt aaa ttc ttg acc aga aac gat gat gga gtc tcc ata 444 
Ala Pro Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile 
115 120 125 

acc gct ctg tct ctc atg aat caa gac aag gtc ctc atg gag agc tgg 492 
Ala Ala Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp 
130 135 140 

tac cac ttg acc gag gca gtt ctt gaa ggt gga att cca ttt aac aag 540 
Tyr His Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys 
145 150 155 

gcc tat gga atg aca gca ttt gag tac cat ggc acc gat ccc aga ttc 588 
Ala Tyr Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe 
160 165 170 

aac aca gtt ttc aac aat gga atg tcc aat cat tcg acc att acc atg 636 
Asn Thr Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met 
175 180 185 190 

aag aaa atc ctt gag act tac aaa ggg ttc gag gga ctt gga tct gtg 684 
Lys Lys Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val 
195 200 205 

gtt gat gtt ggt ggt ggc act ggt gcc cac ctt aac atg att atc gct 732 
Val Asp Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala 
210 215 220 

aaa tac ccc atg atc aag ggc att aac ttc gac ttg cct cat gtt att 780 
Lys Tyr Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile 
225 230 235 

gag gag gct ccc tcc tat cct ggt gtg gag cat gtt ggt gga gat atg 828 
Glu Glu Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met 
240 245 250 

ttt gtt agt gtt cca aaa gga gat gcc att ttc atg aag tgg ata tgt 876 
Phe Val Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys 
255 260 265 270 

cat gat tgg agc gat gaa cac tgc ttg aag ttt ttg aag aaa tgt tat 924 
His Asp Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr 
275 280 285 

gaa gca ctt cca acc aat ggg aag gtg atc ctt gct gaa tgc atc ctc 972 
Glu Ala Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu 
290 295 300 

ccc gtg gcg cca gac gca agc ctc ccc act aag gca gtg gtc cat att 1020 
Pro Val Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile 
305 310 315 

gat gtc atc atg ttg gct cat aac cca ggt ggg aaa gag aga act gag 1068 
Asp Val Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu 
320 325 330 

aag gag ttt gag gcc ttg gcc aag ggg gct gga ttt gaa ggt ttc cga 1116 
Lys Glu Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg 
335 340 345 350 

gta gta gcc tcg tgc gct tac aat aca tgg atc atc gaa ttt ttg aag 1164 
Val Val Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys 
355 360 365 

aag att tgagtcctta ctcggctttg agtacataat accaactcct tttggttttc 1220 
Lys Ile 

gagattgtga ttgtgattgt gattgtctct ctttcgcagt tggccttatg atataatgta 1280 

tcgttaactc gatcacagaa gtgcaaaaga cagtgaatgt acactgcttt ataaaataaa 1340 

aattttaaga ttttgattca tgtaaaaaaa aaaaaaaaaa 1380 



<210> 6 
<211> 368 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 6 

Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala Ala Ala 
1 5 10 15 

Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser Ala Ser 
20 25 30 

Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val Leu Glu 
35 40 45 

Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser Asp Ile 
50 55 60 

Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met Leu Asp 
65 70 75 80 

Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys Ser Leu 
85 90 95 

Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu Ala Pro 
100 105 110 

Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile Ala Ala 
115 120 125 

Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His 
130 135 140 

Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys Ala Tyr 
145 150 155 160 

Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Thr 
165 170 175 

Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys 
180 185 190 

Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val Val Asp 
195 200 205 

Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala Lys Tyr 
210 215 220 

Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Glu 
225 230 235 240 

Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met Phe Val 
245 250 255 

Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp 
260 265 270 

Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr Glu Ala 
275 280 285 

Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val 
290 295 300 

Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile Asp Val 
305 310 315 320 

Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu 
325 330 335 

Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg Val Val 
340 345 350 

Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys Lys Ile 
355 360 365 



<210> 7 
<211> 2025 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (60)..(1679) 

<400> 7 

cggcacgagc tcattttcca cttctggttt gatctctgca attcttccat cagtcccta 59 



atg gag acc caa aca aaa caa gaa gaa atc ata tat cgg tcg aaa ctc 107 
Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

ccc gat atc tac atc ccc aaa cac ctc cct tta cat tcg tat tgt ttc 155 
Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

gag aac atc tca cag ttc ggc tcc cgc ccc tgt ctg atc aat ggc gca 203 
Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

acg ggc aag tat tac aca tat gct gag gtt gag ctc att gcg cgc aag 251 
Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

gtc gca tcc ggc ctc aac aaa ctc ggc gtt cga caa ggt gac atc atc 299 
Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

atg ctt ttg cta ccc aac tcg ccg gag ttc gtg ttt tca att ctc ggc 347 
Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

gca tcc tac cgc ggg gct gcc gcc acc gcc gca aac ccg ttt tat acc 395 
Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

cct gcc gag atc agg aag caa gcc aaa acc tcc aac gcc agg ctt att 443 
Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

atc aca cat gcc tgt tac tat gag aaa gtg aag gac ttg gtg gaa gag 491 
Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

aac gtt gcc aag atc ata tgt ata gac tca ccc ccg gac ggt tgt ttg 539 
Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

cac ttc tcg gag ctg agt gag gcg gac gag aac gac atg ccc aat gta 587 
His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

gag att gac ccc gat gat gtg gtg gcg ctg ccg tac tcg tca ggg acg 635 
Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

acg ggt tta cca aag ggg gtg atg cta aca cac aag gga caa gtg acg 683 
Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

agt gtg gcg caa cag gtg gac gga gag aat ccg aac ctg tat ata cat 731 
Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

agc gag gac gtg gtt ctg tgc gtg ttg cct ctg ttt cac atc tac tcg 779 
Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

atg aac gtc atg ttt tgc ggg tta cga gtt ggt gcg gcg att ctg att 827 
Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

atg cag aaa ttt gaa ata tat ggg ttg tta gag ctg gtc aga agt aca 875 
Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

ggt gac cat cat gcc tat cgt aca ccc atc gta ttg gca atc tcc aag 923 
Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

act ccg gat ctt cac aac tat gat gtg tcc tcc att cgg act gtc atg 971 
Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

tca ggt gcg gct cct ctg ggc aag gaa ctt gaa gat tct gtc aga gct 1019 
Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

aag ttt ccc acc gcc aaa ctt ggt cag gga tat gga atg acg gag gca 1067 
Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

ggg ccc gtg cta gcg atg tgt ttg gca ttt gcc aag gaa ggg ttt gaa 1115 
Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

ata aaa tcg ggg gca tct gga act gtt tta agg aac gca cag atg aag 1163 
Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

att gtg gac cct gaa acc ggt gtc act ctc cct cga aac caa ccc gga 1211 
Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

gag att tgc att aga gga gac caa atc atg aaa ggt tat ctt aat gat 1259 
Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

cct gag gcg acg gag aga acc ata gac aag gaa ggt tgg tta cac aca 1307 
Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 

ggt gat gtg ggc tac atc gac gat gac act gag ctc ttc att gtt gat 1355 
Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 

cgg ttg aag gaa ctg atc aaa tac aaa ggg ttt cag gtg gca ccc gct 1403 
Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

gag ctt gag gcc atg ctc ctc aac cat ccc aac atc tct gat gct gcc 1451 
Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

gtc gtc cca atg aaa gac gat gaa gct gga gag ctc cct gtg gcg ttt 1499 
Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 

gtt gta aga tca gat ggt tct cag ata tcc gag gct gaa atc agg caa 1547 
Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

tac atc gca aaa cag gtg gtt ttt tat aaa aga ata cat cgc gta ttt 1595 
Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

ttc gtc gaa gcc att cct aaa gcg ccc tct ggc aaa atc ttg cgg aag 1643 
Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

gac ctg aga gcc aaa ttg gcg tct ggt ctt ccc aat taattctcat 1689 
Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 

tcgctaccct cctttctctt atcatacgcc aacacgaacg aagaggctca attaaacgct 1749 

gctcattcga agcggctcaa ttaaagctgc tcattcatgt ccaccgagtg ggcagcctgt 1809 

cttgttggga tgttctttca tttgattcag ctgtgagaag ccagaccctc attatttatt 1869 

gtgaaattca caagaatgtc tgtaaatcga tgttgtgagt gatgggtttc aaaacacttt 1929 

tgacattgtt tacgttgtat ttcctgctgt tgaaaataac tactttgtat gacttttatt 1989 

tgggaagata acctttcaaa aaaaaaaaaa aaaaaa 2025 



<210> 8 
<211> 540 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 8 

Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 

Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 

Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 

Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 



<210> 9 
<211> 1544 
<212> DNA 
<213> Pinus taeda 

<400> 9 



aaagataata tatgtgtatg cctactacta cacattgttt tgaagtgtgt aaacatagtg 60 
caacactagg aggactcaca atgagcactt gttgacatga aactagctaa atgcccaaca 120 
atattagtga aagctagtta aactaacccc tttgactttc aagatgatat atttatatcc 180 
ctactacgtc ttcctctttt tgtctttctc ttgtgattaa accttccttg aaacaattct 240 
caaatgtaaa attaaacctt gaaacttgta gagaccaaac ttccctagga gaaaccacat 300 
ttatgacaac atatatacac caacccattg catactataa tattggaatt acctgcagcg 360 
aacgaaagaa acgctgtctc accaactcgt gcactacatc ccgaaactta accttcccct 420 
gatacagatt gaagagccga aaaaagcgtg catccaaatt tctggtatgg tgaggagccg 480 
aaaaacgcgt gcgcctaatt tttttgagat gggccggaaa ataatgcgtg catctaaatt 540 
ttcacgtgtc gcgtattggc gaggttgcgc tgaatgtgat cctgtgcgtg agccacattc 600 
attccattgg ttgacccgcc ggtaccgcga ggaccgtggg gtctcacaga tacgcggatg 660 
gtggatcagc actgagaaga ttagatgatg accaggcggg catttgaagt aaaaacttgg 720 
gggtggttgg caagtacgcg acaaagaggg gtagtgcgca aggaagcgag ttggatgcaa 780 
ataatattac aaagtgggtt ggtgggcatg agcatcaacc agaatgatgt tgttgctggt 840 
tccgtgcaaa ttctgaccag tagtttgaac aatactaccc aacttgtttt tggtaaaaca 900 
tgaagtgggt aaggagaatt gaacttacgt ctcatggtaa agggcaaggg caaatgactt 960 
aacacatacc tttaactaat aaaaataccc ctaacaaata cgaaaacgaa tgagttatca 1020 
cagaccttca actaataaga tagccatcag acccacatct cctgactgac caaaaacaaa 1080 
tgacttcaac caactaagat acccatcaaa gctaacccac aacccaattc ctcacttccc 1140 
cttaccagac caaccaagca gacctacgcc attaactact ttaggacgtg ggaattgggg 1200 
gtgccaccgt tgaagaatgg cactcagggt tggtaatccc tccacgtgta tgtagcagtc 1260 
gtttggtgga gacggcgtgt ttgaatgtcc accttccagt ttggagaaca aggaaattgg 1320 
gcttatatta ggcctggatc tcttgtttca gagcaggagt agttcaggac aggaactagc 1380 
attcaagaat tcaattgccc tgccctgctc tgctctgctt tgctcaactt attgatccct 1440 
gctctggttt gttcaatttc ttgacccctg ctgggttctg ctctggtttg cacactttct 1500 
cgattatata agtcattttg gatccttgca aggaagagaa tatg 1544 

<210> 10 

<211> 659 

<212> DNA 

<213> Pinus taeda 

<400> 10 

aaacaccaat ttaatgggat ttcagatttg tatcccatgc tattggctaa ggcatttttc 60 
ttattgtaat ctaaccaatt ctaatttcca ccctggtgtg aactgactga caaatgcggt 120 
ccgaaaacag cgaatgaaat gtctgggtga tcggtcaaac aagcggtggg cgagagagcg 180 
cgggtgttgg cctagccggg atgggggtag gtagacggcg tattaccggc gagttgtccg 240 
aatggagttt tcggggtagg tagtaacgta gacgtcaatg gaaaaagtca taatctccgt 300 
caaaaatcca accgctcctt cacatcgcag agttggtggc cacgggaccc tccacccact 360 
cactcaatcg atcgcctgcc gtggttgccc attattcaac catacgccac ttgactcttc 420 
accaacaatt ccaggccggc tttctataca atgtactgca caggaaaatc caatataaaa 480 
agccggcctc tgcttccttc tcagtagccc ccagctcatt caattcttcc cactgcaggc 540 
tacatttgtc agacacgttt tccgccattt ttcgcctgtt tctgcggaga atttgatcag 600 
gttcggattg ggattgaatc aattgaaagg tttttatttt cagtatttcg atcgccatg 659 

<210> 11 
<211> 2251 
<212> DNA 
<213> Pinus taeda 

<400> 11 

ggccgggtgg tgacatttat tcataaattc atctcaaaac aagaaggatt tacaaaaata 60 
aaagaaaaca aaattttcat ctttaacata attataattg tgttcacaaa attcaaactt 120 
aaacccttaa tataaagaat ttctttcaac aatacacttt aatcacaact tcttcaatca 180 
caacctcctc caacaaaatt aaaatagatt aataaataaa taaacttaac tatttaaaaa 240 
aaaatattat acaaaattta ttaaaacttc aaaataaaca aactttttat acaaaattca 300 
tcaaaacttt aaaataaagc taaacactga aaatgtgagt acatttaaaa ggacgctgat 360 
cacaaaaatt ttgaaaacat aaacaaactt gaaactctac cttttaagaa tgagtttgtc 420 
gtctcattaa ctcattagtt ttatagttcg aatccaatta acgtatcttt tattttatgg 480 
aataagggtg ttttaataag tgattttggg atttttttag taatttattt gtgatatgtt 540 
atggagtttt taaaaatata tatatatata tatatttttg ggttgagttt acttaaaatt 600 
tggaaaaggt tggtaagaac tataaattga gttgtgaatg agtgttttat ggatttttta 660 
agatgttaaa tttatatatg taattaaaat tttattttga ataacaaaaa ttataattgg 720 



ataaaaaatt gttttgttaa atttagagta aaaatttcaa aatctaaaat aattaaacac 780 
tattattttt aaaaaatttg ttggtaaatt ttatcttata tttaagttaa aatttagaaa 840 
aaattaattt taaattaata aacttttgaa gtcaaatatt ccaaatattt tccaaaatat 900 
taaatctatt ttgcattcaa aatacaattt aaataataaa acttcatgga atagattaac 960 
caatttgtat aaaaaccaaa aatctcaaat aaaatttaaa ttacaaaaca ttatcaacat 1020 
tatgatttca agaaagacaa taaccagttt ccaataaaat aaaaaacctc atggcccgta 1080 
attaagatct cattaattaa ttcttatttt ttaatttttt tacatagaaa atatctttat 1140 
attgtatcca agaaatatag aatgttctcg tccagggact attaatctcc aaacaagttt 1200 
caaaatcatt acattaaagc tcatcatgtc atttgtggat tggaaattat attgtataag 1260 
agaaatatag aatgttctcg tctagggact attaatttcc aaacaaattt caaaatcatt 1320 
acattaaagc tcatcatgtc atttgtggat tggaaattag acaaaaaaaa tcccaaatat 1380 
ttctctcaat ctcccaaaat atagttcgaa ctccatattt ttggaaattg agaatttttt 1440 
tacccaataa tatatttttt tatacatttt agagattttc cagacatatt tgctctggga 1500 
tttattggaa tgaaggttga gttataaact ttcagtaatc caagtatctt cggtttttga 1560 
agatactaaa tccattatat aataaaaaca cattttaaac accaatttaa tgggatttca 1620 
gatttgtatc ccatgctatt ggctaaggca tttttcttat tgtaatctaa ccaattctaa 1680 
tttccaccct ggtgtgaact gactgacaaa tgcggtccga aaacagcgaa tgaaatgtct 1740 
gggtgatcgg tcaaacaagc ggtgggcgag agagcgcggg tgttggccta gccgggatgg 1800 
gggtaggtag acggcgtatt accggcgagt tgtccgaatg gagttttcgg ggtaggtagt 1860 
aacgtagacg tcaatggaaa aagtcataat ctccgtcaaa aatccaaccg ctccttcaca 1920 
tcgcagagtt ggtggccacg ggaccctcca cccactcact cgatcgcctg ccgtggttgc 1980 
ccattattca accatacgcc acttgactct tcaccaacaa ttccaggccg gctttctata 2040 
caatgtactg cacaggaaaa tccaatataa aaagccggcc tctgcttcct tctcagtagc 2100 
ccccagctca ttcaattctt cccactgcag gctacatttg tcagacacgt tttccgccat 2160 
ttttcgcctg tttctgcgga gaatttgatc aggttcggat tgggattgaa tcaattgaaa 2220 
ggtttttatt ttcagtattt cgatcgccat g 2251 